Overview and key concepts
This chapter provides a conceptual overview of the goals and issues associated with scaling applications in the WebSphere environment. It also presents the various techniques that can be used to implement scaling. A short section highlights the performance improvements of WebSphere Application Server V6 over previous versions (see 1.5, “Performance improvements over previous versions” on page 31). A prerequisite for all WebSphere scalability and workload management concepts discussed in this redbook is the use of IBM WebSphere Application Server Network Deployment V6 or IBM WebSphere Business Integration Server Foundation V6. We assume the use of IBM WebSphere Application Server Network Deployment throughout this book.
1.1 Objectives
As outlined by the product documentation, a typical minimal configuration for a WebSphere Application Server V6 installation is illustrated in Figure 1-1. There is a single Web (HTTP) server and a single application server, and both are co-resident on a single machine. Please note that such a configuration is not recommended for security reasons; there is no DMZ established. This is true for both internal (intranet) and external (Internet) applications. Naturally, the
performance characteristics of this setup are limited by the power of the machine on which it is running, by various constraints inherent in the configuration itself, and by the implementation of WebSphere Application Server.
We are interested in defining system configurations that exhibit the following properties:
Scalability
Workload management
Availability
Maintainability
Session management
Performance impacts of WebSphere Application Server security
Figure 1-1 Basic WebSphere configuration: a Web client, an HTTP Web server, and a single WebSphere Application Server (servlet and EJB engines plus admin service), with its XML file configuration repository and the application database
Note that scaling the application server environment does not help if your
application has an unscalable design. For Web application and EJB development
best practices, refer to Chapter 16, “Application development: best practices for
application design, performance and scalability” on page 895 and to the white
paper WebSphere Application Server Development Best Practices for
Performance and Scalability, found at:
http://www.ibm.com/software/webservers/appserv/ws_bestpractices.pdf
1.1.1 Scalability
Scalability defines how easily a site can expand. Web sites must be able to grow, sometimes with little warning, to support an increased load. The increased load may come from many sources:
New markets
Normal growth
Extreme peaks
An application and infrastructure that are architected for good scalability make site growth possible and easy. Most often, scalability is achieved by adding hardware resources to improve throughput. A more complex configuration, employing additional hardware, should allow you to service a higher client load than the simple basic configuration. Ideally, it should be possible to service any given load simply by adding servers or machines (or upgrading existing resources).

However, adding new systems or processing power does not always provide a linear increase in throughput. For example, doubling the number of processors in your system will not necessarily result in twice the processing capacity, nor will adding another horizontal server in the application server tier necessarily double the request-serving capacity. Additional resources introduce additional overhead for resource management and request distribution. While the overhead and corresponding degradation may be small, remember that adding n additional machines does not always result in n times the throughput.

Also, you should not simply add hardware without first doing some investigation and possible software tuning to identify potential bottlenecks in your application or in any other performance-related software configuration. Adding more hardware does not necessarily improve performance if the software is badly designed or not tuned correctly. Once the software optimization has been done,
then the hardware resources should be considered as the next step for improving
performance. See also section 14.2, “Scalability” of the redbook IBM WebSphere
V6 Planning and Design Handbook, SG24-6446 for information about this topic.
There are two different ways to improve performance when adding hardware:
vertical scaling and horizontal scaling. See “Scalability” on page 21 for more
information about these concepts.
Performance
Performance involves minimizing the response time for a given request or transaction.
The response time is the time it takes from the moment the user initiates a request at the browser to the moment the resulting HTML page is returned to the browser. At one time, there was an unwritten rule of the Internet known as the “8 second rule.” This rule stated that any page that did not respond within eight seconds would be abandoned. Many enterprises still use this as the response time benchmark threshold for Web applications.
Throughput
Throughput, while related to performance, more precisely defines the number of concurrent transactions that can be accommodated. For example, if an application can handle 10 customer requests simultaneously and each request takes one second to process, this site has a potential
throughput of 10 requests per second.
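The same relationship can be used in reverse for rough capacity estimates. The following arithmetic is only a sketch under assumed figures (a target of 50 requests per second and a two-second response time), not a sizing methodology:

   potential throughput = concurrent requests / response time
                        = 10 requests / 1 second
                        = 10 requests per second

   required concurrency = target throughput x response time
                        = 50 requests per second x 2 seconds
                        = 100 concurrent requests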
1.1.2 Workload management
Workload management (WLM) is a WebSphere facility that provides load balancing and affinity between application servers in a WebSphere clustered environment. Workload management can be an important facet of performance. WebSphere uses workload management to send requests to alternate members of the cluster. WebSphere also routes concurrent requests from a user to the application server that serviced the first request, because EJB calls and session state will be in the memory of that application server.

The proposed configuration should ensure that each machine or server in the configuration processes a fair share of the overall client load that is being processed by the system as a whole. In other words, it is not efficient to have one machine overloaded while another machine is mostly idle. If all machines have roughly the same capacity (for example, CPU power), each should process a roughly equal share of the load. Otherwise, there needs to be a provision for workload to be distributed in proportion to the processing power available on each machine.
Furthermore, if the total load changes over time, the system should automatically adapt itself; for example, all machines may use 50% of their capacity, or all machines may use 100% of their capacity, but it should never be the case that one machine runs at 100% of its capacity while the rest run at 15%.
In this redbook, we discuss both Web server load balancing using WebSphere
Edge Components and WebSphere workload management techniques.
1.1.3 Availability
Also known as resiliency, availability is the system’s ability to respond to requests no matter the circumstances. Availability requires that the
topology provide some degree of process redundancy in order to eliminate single
points of failure. While vertical scalability can provide this by creating multiple
processes, the physical machine then becomes a single point of failure. For this
reason, a high availability topology typically involves horizontal scaling across
multiple machines.
Hardware-based high availability
Using a WebSphere Application Server multiple machine configuration
eliminates a given application server process as a single point of failure. In
WebSphere Application Server V5.0 and higher, the removal of the application
dependencies on the administrative server process for security, naming and
transactions further reduces the potential that a single process failure can disrupt
processing on a given node. In fact, the only single point of failure in a
WebSphere cell is the Deployment Manager, where all central administration is
performed. However, a failure at the Deployment Manager only impacts the
ability to change the cell configuration and to run the Tivoli Performance Viewer
which is now included in the Administrative Console; application servers are
more self-sufficient in WebSphere Application Server V5.0 and higher compared
to WebSphere Application Server V4.x. A number of alternatives exist to provide
high availability for the Deployment Manager, including the possible use of an
external high availability solution, though the minimal impact of a Deployment
Manager outage typically does not require the use of such a solution. In many cases, manual solutions such as the one outlined in the article “Implementing a Highly Available Infrastructure for WebSphere Application Server Network Deployment, Version 5.0 without Clustering” provide suitable availability. This article is available at:
http://www-128.ibm.com/developerworks/websphere/library/techarticles/0304_alcott/alcott.html
See the redbook WebSphere Application Server Network Deployment V6: High
availability solutions, SG24-6688 for further details on external HA solutions.
Failover
The proposition to have multiple servers (potentially on multiple independent
machines) naturally leads to the potential for the system to provide failover. That
is, if any one machine or server in the system were to fail for any reason, the
system should continue to operate with the remaining servers. The load
balancing property should ensure that the client load gets redistributed to the
remaining servers, each of which will take on a proportionately slightly higher
percentage of the total load. Of course, such an arrangement assumes that the
system is designed with some degree of overcapacity, so that the remaining
servers are indeed sufficient to process the total expected client load.
Ideally, the failover aspect should be totally transparent to clients of the system.
When a server fails, any client that is currently interacting with that server should
be automatically redirected to one of the remaining servers, without any
interruption of service and without requiring any special action on the part of that
client. In practice, however, most failover solutions may not be completely
transparent. For example, a client that is currently in the middle of an operation
when a server fails may receive an error from that operation, and may be
required to retry (at which point the client would be connected to another, still
available server). Or the client may observe a pause or delay in processing,
before the processing of its requests resumes automatically with a different
server. The important point in failover is that each client, and the set of clients as
a whole, is able to eventually continue to take advantage of the system and
receive service, even if some of the servers fail and become unavailable.
Conversely, when a previously failed server is repaired and again becomes
available, the system may transparently start using that server again to process a
portion of the total client load.
The failover aspect is also sometimes called fault tolerance, in that it allows the
system to survive a variety of failures or faults. It should be noted, however, that
failover is only one technique in the much broader field of fault tolerance, and that
no such technique can make a system 100 percent safe against every possible
failure. The goal is to greatly minimize the probability of system failure, but keep
in mind that the possibility of system failure cannot be completely eliminated.
Note that in the context of discussions on failover, the term server most often
refers to a physical machine (which is typically the type of component that fails).
However, we will see that WebSphere also allows for the possibility of one server
process on a given machine to fail independently, while other processes on that
same machine continue to operate normally.
HAManager
WebSphere V6 introduces a new concept for advanced failover and thus higher
availability, called the High Availability Manager (HAManager). The HAManager
enhances the availability of WebSphere singleton services like transaction
services or JMS message services. It runs as a service within each application server process and monitors the health of WebSphere clusters. In the event of a
server failure, the HAManager will failover the singleton service and recover any
in-flight transactions. Please see Chapter 9, “WebSphere HAManager” on
page 465 for details.
1.1.4 Maintainability
Maintainability is the ability to keep the system running before, during, and after
scheduled maintenance. When considering maintainability in performance and
scalability, remember that maintenance periodically needs to be performed on
hardware components, operating systems, and software products in addition to
the application components.
While maintainability is somewhat related to availability, there are specific issues
that need to be considered when deploying a topology that is maintainable. In
fact, some maintainability factors are at cross purposes to availability. For
instance, ease of maintainability would dictate that one should minimize the
number of application server instances in order to facilitate online software
upgrades. Taken to the extreme, this would result in a single application server
instance, which of course would not provide a high availability solution. In many
cases, it is also possible that a single application server instance would not
provide the required throughput or performance.
Some of the maintainability aspects that we consider are:
Dynamic changes to configuration
Mixed configuration
Fault isolation
Dynamic changes to configuration
In certain configurations, it may be possible to modify the configuration on the fly
without interrupting the operation of the system and its service to clients. For
example, it may be possible to add or remove servers to adjust to variations in
the total client load. Or it may be possible to temporarily stop one server to
change some operational or tuning parameters, then restart it and continue to
serve client requests. Such characteristics, when possible, are highly desirable,
since they enhance the overall manageability and flexibility of the system.
Mixed configuration
In some configurations, it may be possible to mix multiple versions of a server or
application, so as to provide for staged deployment and a smooth upgrade of the
overall system from one software or hardware version to another. Coupled with
the ability to make dynamic changes to the configuration, this property may be
used to effect upgrades without any interruption of service.
Fault isolation
In the simplest application of failover, we are only concerned with clean failures of
an individual server, in which a server simply ceases to function completely, but
this failure has no effect on the health of other servers. However, there are
sometimes situations where one malfunctioning server may in turn create
problems for the operation of other, otherwise healthy servers. For example, one
malfunctioning server may hoard system resources, or database resources, or
hold critical shared objects for extended periods of time, and prevent other
servers from getting their fair share of access to these resources or objects. In
this context, some configurations may provide a degree of fault isolation, in that
they reduce the potential for the failure of one server to affect other servers.
1.1.5 Session state
Unless you have only a single application server or your application is completely
stateless, maintaining state between HTTP client requests also plays a factor in
determining your configuration. Using session information, however, means walking a fine line between convenience for the developer and the performance and scalability of the system. It is not practical to eliminate session data altogether, but care
should be taken to minimize the amount of session data passed. Persistence
mechanisms decrease the capacity of the overall system, or incur additional
costs to increase the capacity or even the number of servers. Therefore, when
designing your WebSphere environment, you need to take session needs into
account as early as possible.
In WebSphere V6, there are two methods for sharing of sessions between
multiple application server processes (cluster members). One method is to
persist the session to a database. An alternate approach is to use
memory-to-memory session replication functionality, which was added to
WebSphere V5 and is implemented using WebSphere internal messaging. The
memory-to-memory replication (sometimes also referred to as “in-memory
replication”) eliminates a single point of failure found in the session database (if
the database itself has not been made highly available using clustering
software). See 1.4.1, “HTTP sessions and the session management facility” on
page 25 for additional information.

Tip: Refer to the redbook A Practical Guide to DB2 UDB Data Replication V8, SG24-6828, for details about DB2 replication.
1.1.6 Performance impact of WebSphere Application Server security
The potential to distribute the processing responsibilities between multiple
servers and, in particular, multiple machines, also introduces a number of
opportunities for meeting special security constraints in the system. Different
servers or types of servers may be assigned to manipulate different classes of
data or perform different types of operations. The interactions between the
various servers may be controlled, for example through the use of firewalls, to
prevent undesired accesses to data.
SSL
Establishing SSL communication causes extra request and response flows between the machines, and every SSL message must be encrypted on one side and decrypted on the other side.
SSL at the front end or Web tier is a common implementation. This allows for
secured purchases, secure viewing of bank records, etc. SSL handshaking,
however, is expensive from a performance perspective. The Web server is
responsible for encrypting data sent to the user, and decrypting the request from
the user. If a site is operated using SSL more often than not, and the server is running close to maximum CPU utilization, then using some form of SSL accelerator hardware in the server, or even an external device that offloads all SSL traffic before it reaches the Web server, may help improve performance.
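To make the handshake cost tangible, the following minimal Java sketch times an initial SSL handshake using the standard JSSE API. The host name is an assumption and must be replaced with a reachable HTTPS endpoint:

import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

// Minimal sketch: measures the time spent in the initial SSL
// handshake against a hypothetical host. Subsequent requests on a
// kept-alive or resumed connection avoid this cost.
public class HandshakeTimer {
    public static void main(String[] args) throws Exception {
        String host = "www.example.com"; // assumption: reachable HTTPS host
        SSLSocketFactory factory =
            (SSLSocketFactory) SSLSocketFactory.getDefault();
        long start = System.currentTimeMillis();
        SSLSocket socket = (SSLSocket) factory.createSocket(host, 443);
        socket.startHandshake(); // key exchange and cipher negotiation
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("Handshake took " + elapsed + " ms using "
            + socket.getSession().getCipherSuite());
        socket.close();
    }
}

Because the full handshake is what dominates, its overhead matters most for sites with many short-lived connections.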
Performance-related security settings
The following settings can help to fine-tune the security-related configurations to
enhance performance:
Security cache timeout
This setting determines how long WebSphere should cache information related to permissions and security credentials. When the cache timeout expires, all
cached information becomes invalid. Subsequent requests for the information
result in a database lookup. Sometimes, acquiring the information requires
invoking an LDAP-bind or native authentication, both of which are relatively
costly operations in terms of performance.
HTTP session timeout
This parameter specifies how long a session will be considered active when it
is unused. After the timeout, the session expires and another session object
will be created. With high-volume Web sites, this may influence the
performance of the server.
Registry and database performance
Databases and registries used by an application influence the WebSphere
Application Server performance. In a distributed environment, it is highly
recommended that you put the authorization server onto a separate machine
in order to offload application processing. When planning for a secured
environment, special attention should be given to the sizing and
implementation of the directory server. It is a best practice to first tune the
directory server and make sure that it performs well before looking at the
WebSphere environment. Also, make sure that this directory does not
introduce a new single point of failure.

Note: WebSphere security is out of the scope of this redbook. However, an entire redbook is dedicated to this topic. See WebSphere Application Server V6: Security Handbook, SG24-6316 for details.
1.2 WebSphere Application Server architecture
This section introduces the major components within WebSphere Application
Server. Refer to the Redpaper entitled WebSphere Application Server V6
architecture, REDP3918 and to the redbook WebSphere Application Server V6
System Management and Configuration Handbook, SG24-6451 for details about
the WebSphere Application Server architecture and components.
1.2.1 WebSphere Application Server Network Deployment components
The following is a short introduction to each WebSphere Network Deployment runtime component and its function:
Deployment Manager
The Deployment Manager is part of an IBM WebSphere Application Server
Network Deployment V5.x and higher installation. It provides a central point of
administrative control for all elements in a distributed WebSphere cell. The
Deployment Manager is a required component for any vertical or horizontal scaling, and only one Deployment Manager exists in a cell at a time.
Node Agent
The Node Agent communicates directly with the Deployment Manager, and is
used for configuration synchronization and administrative tasks such as
stopping/starting of individual application servers and performance
monitoring on the application server node.
Application server (instance)
An application server in WebSphere is the process that is used to run your
servlet and/or EJB-based applications, providing both Web container and EJB
container.
Web server and Web server plug-in
While the Web server is not strictly part of the WebSphere runtime,
WebSphere communicates with your Web server of choice: IBM HTTP Server
powered by Apache, Lotus Domino, Microsoft® Internet Information Server,
Apache, Sun ONE Web server/Sun Java System Web Server, and Covalent
Enterprise Ready Server via a plug-in. This plug-in communicates requests
from the Web server to the WebSphere runtime. In WebSphere Application
Server V6, the Web server and plug-in XML file can also be managed from
the Deployment Manager. For more information about this new function, see
3.4, “Web server topology in a Network Deployment cell” on page 78.
WebContainer Inbound Chain (formerly embedded HTTP transport)
The WebContainer Inbound Chain is a service of the Web container. An
HTTP client connects to a Web server and the HTTP plug-in forwards the
requests to the application server through the WebContainer Inbound Chain.
The communication type is either HTTP or HTTPS.
Although the WebContainer Inbound Chain is available in any WebSphere
Application Server to allow testing or development of the application
functionality prior to deployment, it should not be used for production
environments. See 6.2.1, “WebContainer Inbound Chains” on page 231 for
more details.
Default messaging provider
An embedded JMS server was introduced with WebSphere Application
Server V5 but has been totally rewritten for WebSphere Application Server V6
and is now called the default messaging provider. This infrastructure supports
both message-oriented and service-oriented applications, and is fully
integrated within WebSphere Application Server V6 (all versions). The default
messaging provider is a JMS 1.1 compliant JMS provider that supports
point-to-point and publish/subscribe styles of messaging.
Administrative service
The administrative service runs within each WebSphere Application Server
Java virtual machine (JVM). Depending on the version installed, there can be
administrative services installed in many locations. In WebSphere Application
Server V6 and in WebSphere Application Server - Express V6, the
administrative service runs in each application server. In a Network
Deployment configuration, the Deployment Manager, Node Agent, and
application server(s), all host an administrative service. This service provides
the ability to update configuration data for the application server and its
related components.
Administrative Console
The WebSphere Administrative Console provides an easy-to-use, graphical
“window” to a WebSphere cell and runs in a browser. In a Network
Deployment environment, it connects directly to the Deployment Manager,
which allows management of all nodes in the cell.
Configuration repository
The configuration repository stores the configuration for all WebSphere
Application Server components inside a WebSphere cell. The Deployment
Manager communicates changes in configuration and runtime state to all cell
members through the administrative service and the Node Agent. Each
application server maintains a subset of the cell’s configuration repository,
which holds this application server’s configuration information.
Administrative cell
A cell is a set of application servers managed by the same Deployment
Manager. Some application server instances may reside on the same node,
others on different nodes. Cell members do not necessarily run the same
application.
Server cluster and cluster members
A server cluster is a set of application server processes (the cluster
members). They run the same application, but usually on different nodes. The
main purpose of clusters is load balancing and failover.
Figure 1-2 on page 15 shows the relationship between these components.
Figure 1-2 WebSphere configuration with Java clients and Web clients
1.2.2 Web clients
The basic configuration shown in Figure 1-1 on page 4 refers to a particular type
of client for WebSphere characterized as a Web client. Such a client uses a Web
browser to interact with the WebSphere Application Servers, through the
intermediary of a Web server (and plug-in), which in turn invokes a servlet within
WebSphere (you can also access static HTML pages).
1.2.3 Java clients
Although Web clients constitute the majority of users of WebSphere today,
WebSphere also supports another type of client characterized as a Java client.
Such a client is a stand-alone Java program (GUI-based or not), which uses the
Java RMI/IIOP facilities to make direct method invocations on various EJB
objects within an application server, without going through an intervening Web
server and servlet, as shown in Figure 1-2 on page 15.
1.3 Workload management
Workload Management (WLM) is the process of spreading multiple requests for
work over the resources that can do the work. It optimizes the distribution of
processing tasks in the WebSphere Application Server environment. Incoming
work requests are distributed to the application servers and other objects that
can most effectively process the requests.
Workload management is also a procedure for improving performance,
scalability, and reliability of an application. It provides failover when servers are
not available.
Workload management is most effective when the deployment topology is
comprised of application servers on multiple machines, since such a topology
provides both failover and improved scalability. It can also be used to improve
scalability in topologies where a system is comprised of multiple servers on a
single, high-capacity machine. In either case, it enables the system to make the
most effective use of the available computing resources.
Workload management provides the following benefits to WebSphere
applications:
It balances client processing requests, allowing incoming work requests to be
distributed according to a configured WLM selection policy.
It provides failover capability by redirecting client requests to a running server
when one or more servers are unavailable. This improves the availability of
applications and administrative services.
It enables systems to be scaled up to serve a higher client load than provided
by the basic configuration. With clusters and cluster members, additional
instances of servers can easily be added to the configuration. See 1.3.3,
“Workload management using WebSphere clustering” on page 19 for details.
It enables servers to be transparently maintained and upgraded while
applications remain available for users.
It centralizes administration of application servers and other objects.
Two types of requests can be workload-managed in IBM WebSphere Application
Server Network Deployment V6:
HTTP requests can be distributed across multiple Web containers, as
described in 1.3.2, “Plug-in workload management” on page 17. We refer to
this as Plug-in WLM.
EJB requests can be distributed across multiple EJB containers, as described
in 1.3.4, “Enterprise Java Services workload management” on page 23. We
refer to this as EJS WLM.
1.3.1 Web server workload management
If your environment uses multiple Web servers, a mechanism is needed to allow
servers to share the load. The HTTP traffic must be spread among a group of
servers. These servers must appear as one server to the Web client (browser),
making up a cluster, which is a group of independent nodes interconnected and
working together as a single system. This cluster concept should not be
confused with a WebSphere cluster as described in 1.3.3, “Workload
management using WebSphere clustering” on page 19.
As shown in Figure 1-3, a load balancing mechanism called IP spraying can be
used to intercept the HTTP requests and redirect them to the appropriate
machine on the cluster, providing scalability, load balancing, and failover.
See Chapter 4, “Introduction to WebSphere Edge Components” on page 99 for
details.
Figure 1-3 Web server workload management
1.3.2 Plug-in workload management
This section gives a short introduction to plug-in workload management, where servlet
requests are distributed to the Web container in clustered application servers, as
shown in Figure 1-4 on page 18. This configuration is also referred to as the
servlet clustering architecture.
Figure 1-4 Plug-in (Web container) workload management
Clustering application servers that host Web containers automatically enables
plug-in workload management for the application servers and the servlets they
host. In the simplest case, the cluster is configured on a single machine, where
the Web server process also runs.
Routing of servlet requests occurs between the Web server plug-in and the
clustered application servers using HTTP or HTTPS. This routing is based purely
on the weights associated with the cluster members. If all cluster members have
identical weights, the plug-in sends an equal number of requests to all members
of the cluster, assuming no session affinity. If the weights are scaled in the
range from zero to 20, the plug-in routes requests to those cluster members with
the higher weight value more often. A rule of thumb formula for determining
routing preference would be:
% routed to Server1 = weight1 / (weight1 + weight2 + ... + weightn)
where n is the number of cluster members in the cluster. The Web server plug-in
routes requests around cluster members that are not available.
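As an illustration of weight-proportional selection only (a hypothetical class, not the actual plug-in code), a random-pick variant of the formula can be sketched in Java:

import java.util.Random;

// Hypothetical sketch: selects a cluster member with probability
// weight_i / (weight1 + weight2 + ... + weightn), matching the rule
// of thumb formula above. Assumes at least one positive weight.
public class WeightedRouter {
    private final String[] members;
    private final int[] weights;
    private final Random random = new Random();

    public WeightedRouter(String[] members, int[] weights) {
        this.members = members;
        this.weights = weights;
    }

    public String pickMember() {
        int total = 0;
        for (int i = 0; i < weights.length; i++) {
            total += weights[i]; // weight1 + weight2 + ... + weightn
        }
        int r = random.nextInt(total); // 0 .. total-1
        for (int i = 0; i < members.length; i++) {
            r -= weights[i];
            if (r < 0) {
                return members[i];
            }
        }
        return members[members.length - 1]; // not reached for valid weights
    }

    public static void main(String[] args) {
        WeightedRouter router = new WeightedRouter(
            new String[] { "Server1", "Server2" }, new int[] { 2, 1 });
        // Over many picks, roughly two thirds go to Server1.
        for (int i = 0; i < 6; i++) {
            System.out.println(router.pickMember());
        }
    }
}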
See Chapter 6, “Plug-in workload management and failover” on page 227 for
details.
1.3.3 Workload management using WebSphere clustering
This section describes how workload management is implemented in IBM
WebSphere Application Server Network Deployment V6 by using application
server clusters and cluster members. How to create clusters and cluster
members is detailed in Chapter 8, “Implementing the sample topology” on
page 387.
A cluster is a set of application servers that are managed together and
participate in workload management. Application servers participating in a
cluster can be on the same node or on different nodes. A Network Deployment
cell can contain no clusters, or have many clusters depending on the need of the
administration of the cell.
The cluster is a logical representation of the application servers. It is not
necessarily associated with any node, and does not correspond to any real
server process running on any node. A cluster contains only application servers,
and the weighted workload capacity associated with those servers.
When creating a cluster, it is possible to select an existing application server as
the template for the cluster without adding that application server into the new
cluster (the chosen application server is used only as a template, and is not
affected in any way by the cluster creation). All other cluster members are then
created based on the configuration of the first cluster member.
Cluster members can be added to a cluster in various ways: during cluster
creation and afterwards. During cluster creation, one existing application server
can be added to the cluster and/or one or more new application servers can be
created and added to the cluster. There is also the possibility of adding additional
members to an existing cluster later on. Depending on the capacity of your
systems, you can define different weights for the various cluster members.
It may be a good idea to create the cluster with a single member first, adjust the
member's configuration and then add the other members. This process
guarantees that all cluster members are created with the same settings. Cluster
members are required to have identical application components, but they can be
sized differently in terms of weight, heap size, and other environmental factors.
You must be careful though not to change anything that might result in different
application behavior on each cluster member. This concept allows large
enterprise machines to belong to a cluster that also contains smaller machines
such as Intel® based Windows servers.
Starting or stopping the cluster automatically starts or stops all cluster members,
and changes to the application are propagated to all application servers in the
cluster.
Figure 1-5 shows an example of a possible configuration that includes server
clusters. Server cluster 1 has two cluster members on node B only. Server
cluster 2, which is completely independent of server cluster 1, has two cluster
members on node A and three cluster members on node B. Finally, node A also
contains a free-standing application server that is not a member of any cluster.
Figure 1-5 Server clusters and cluster members
Clusters and cluster members provide the necessary support for workload
management, failover, and scalability. For details, please refer to Chapter 6,
“Plug-in workload management and failover” on page 227.
Distributing workloads
The ability to route a request to any server in a group of clustered application
servers allows the servers to share work, improving the throughput of client requests. Requests can be evenly distributed to servers to prevent workload imbalances in which one or more servers have idle or low activity while others are
overburdened. This load balancing activity is a benefit of workload management.
Using weighted definitions of cluster members allows nodes to have different
hardware resources and still participate in a cluster. A cluster member with a higher weight is expected to serve requests faster, so workload management consequently sends more requests to that node.
Failover
With several cluster members available to handle requests, it is more likely that
failures will not negatively affect throughput and reliability. With cluster members
distributed to various nodes, an entire machine can fail without any application
downtime. Requests can be routed to other nodes if one node fails. Clustering
also allows for maintenance of nodes without stopping application functionality.
Scalability
Clustering is an effective way to perform vertical and horizontal scaling of
application servers.
In vertical scaling, shown in Figure 1-6, multiple cluster members for an
application server are defined on the same physical machine, or node, which
may allow the machine’s processing power to be more efficiently allocated.
Even if a single JVM can fully utilize the processing power of the machine, you
may still want to have more than one cluster member on the machine for other
reasons, such as using vertical clustering for software failover. If a JVM
reaches a table/memory limit (or if there is some similar problem), then the
presence of another process provides for failover.
Figure 1-6 Vertical scaling
We recommend that you avoid using “rules of thumb” when determining the
number of cluster members needed for a given machine. The only way to
determine what is correct for your environment and application(s) is to tune a
single instance of an application server for throughput and performance, then
add it to a cluster, and incrementally add additional cluster members. Test
performance and throughput as each member is added to the cluster. Always
monitor memory usage when you are configuring a vertical scaling topology
and do not exceed the available physical memory on a machine.
In general, 85% (or more) utilization of the CPU on a large server shows that
there is little, if any, performance benefit to be realized from adding additional
cluster members.
In horizontal scaling, shown in Figure 1-7, cluster members are created on
multiple physical machines. This allows a single WebSphere application to
run on several machines while still presenting a single system image, making
the most effective use of the resources of a distributed computing
environment. Horizontal scaling is especially effective in environments that
contain many smaller, less powerful machines. Client requests that
overwhelm a single machine can be distributed over several machines in the
system. Failover is another benefit of horizontal scaling. If a machine
becomes unavailable, its workload can be routed to other machines
containing cluster members.
Figure 1-7 Horizontal scaling
Horizontal scaling can handle application server process failures and
hardware failures (or maintenance) without significant interruption to client
service.
WebSphere applications can combine horizontal and vertical scaling to reap
the benefits of both scaling techniques, as shown in Figure 1-8 on page 23.
Note: WebSphere Application Server V5.0 and higher supports horizontal
clustering across different platforms and operating systems. Horizontal
cloning on different platforms was not supported in WebSphere Application
Server V4.
Figure 1-8 Vertical and horizontal scaling
Secure application cluster members
The workload management service has its own built-in security, which works with
the WebSphere Application Server security service to protect cluster member
resources. If security is needed for your production environment, enable security
before you create a cluster for the application server. This enables security for all
of the members in that cluster.
The EJB method permissions, Web resource security constraints, and security
roles defined in an enterprise application are used to protect EJBs and servlets
in the application server cluster. Refer to WebSphere Application Server V6:
Security Handbook, SG24-6316 for more information.
1.3.4 Enterprise Java Services workload management
In this section, we discuss Enterprise Java Services workload management (EJS
WLM), in which the Web container and EJB container are on different application
servers. The application server hosting the EJB container is clustered.
Configuring the Web container in a separate application server from the
Enterprise JavaBean container (an EJB container handles requests for both
session and entity beans) enables distribution of EJB requests between the EJB
container clusters, as seen in Figure 1-9 on page 24.
Figure 1-9 EJB workload management
In this configuration, EJB client requests are routed to available EJB containers
based on the workload management EJB selection policy (Server-weighted
round robin routing or Prefer local).
The EJB clients can be servlets operating within a Web container, stand-alone
Java programs using RMI/IIOP, or other EJBs.
EJS workload management is covered in detail in Chapter 7, “EJB workload
management” on page 341.

Important: Although it is possible to split the Web container and EJB container, it is not recommended because of the negative performance impact. In addition, application maintenance also becomes more complex when the application runs in different application servers.
1.4 Managing session state among servers
All the load distribution techniques discussed in this book rely, on one level or
another, on using multiple copies of an application server and arranging for
multiple consecutive requests from various clients to be serviced by different
servers.
If each client request is completely independent of every other client request,
then it does not matter if two requests are processed on the same server or not.
However, in practice, there are many situations where all requests from an
individual client are not totally independent. In many usage scenarios, a client
makes one request, waits for the result, then makes one or more subsequent
requests that depend upon the results received from the earlier requests.
Such a sequence of operations on behalf of one client falls into one of two
categories:
Stateless: the server that processes each request does so based solely on
information provided with that request itself, and not on information that it
“remembers” from earlier requests. In other words, the server does not need
to maintain state information between requests.
Stateful: the server that processes a request does need to access and
maintain state information generated during the processing of an earlier
request.
Again, in the case of stateless interactions, it does not matter if different requests
are being processed by different servers. But in the case of stateful interactions,
we must ensure that whichever server is processing a request has access to the
state information necessary to service that request. This can be ensured either
by arranging for the same server to process all the client requests associated
with the same state information, or by arranging for that state information to be
shared and equally accessible by all servers that may require it. In the latter case,
it is often advantageous to arrange for most accesses to the shared state to be
performed from the same server, so as to minimize the communications
overhead associated with accessing the shared state from multiple servers.
This consideration for the management of stateful interactions is a factor in the
discussions of various load distribution techniques throughout this book. It also
requires the discussion of a specific facility, the session management facility, and
of server affinity.
1.4.1 HTTP sessions and the session management facility
In the case of an HTTP client interacting with a servlet, the state information
associated with a series of client requests is represented as an HTTP session,
and identified by a session ID. The session manager module that is part of each
Web container is responsible for managing HTTP sessions, providing storage for
session data, allocating session IDs, and tracking the session ID associated with
each client request, through the use of cookies, URL rewriting, or SSL ID
techniques.
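From the application’s point of view, all of this is hidden behind the standard servlet session API. The following minimal sketch, with hypothetical servlet and attribute names, shows a session being created, read, and updated; it also sets the session timeout programmatically (see 1.1.6):

import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;

// Hypothetical servlet: the session manager allocates the session ID
// and tracks it (cookie, URL rewriting, or SSL ID); the code only
// sees the HttpSession API. Session data is kept deliberately small.
public class VisitCounterServlet extends HttpServlet {
    protected void doGet(HttpServletRequest request,
            HttpServletResponse response) throws IOException {
        HttpSession session = request.getSession(true); // create if absent
        session.setMaxInactiveInterval(1800); // expire after 30 idle minutes

        Integer visits = (Integer) session.getAttribute("visits");
        int count = (visits == null) ? 1 : visits.intValue() + 1;
        session.setAttribute("visits", new Integer(count));

        PrintWriter out = response.getWriter();
        out.println("Visit " + count + " in session " + session.getId());
    }
}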
The session manager provides for the storage of session-related information either in memory within the application server (in which case it cannot be shared with other application servers), in a back-end database shared by all application servers, or by using memory-to-memory replication.
The second option, sometimes referred to as persistent sessions or session
clustering, is one method for using HTTP sessions with a load-distribution configuration. With this option, whenever an application server receives a request associated with a session ID that is not in memory, it can obtain the session by accessing the back-end database, and can then serve the request.
option is not enabled, and another clustering mechanism is not used, if any load
distribution mechanism happens to route an HTTP request to an application
server other than the one where the session was originally created, that server
would be unable to access the session, and would thus not produce correct
results in response to that request. One drawback to the database solution, just
as with application data, is that it provides a single point of failure (SPOF) so it
should be implemented in conjunction with hardware clustering products such as
HACMP or solutions such as database replication. Another drawback is the
performance hit, caused by database disk I/O operations and network
communications.
The last option, memory-to-memory replication, provides a mechanism for
session clustering within IBM WebSphere Application Server Network
Deployment. It uses the built-in Data Replication Service (DRS) of WebSphere to
replicate session information stored in memory to other members of the cluster.
Using this functionality removes the single point of failure that is present in
persistent sessions through a database solution that has not been made highly
available using clustering software. The sharing of session state is handled by
creating a replication domain and then configuring the Web container to use that
replication domain to replicate session state information to the specified number
of application servers. DRS, like database replication, also incurs a performance
hit, primarily because of the overhead of network communications. Additionally,
since copies of the session object reside in application server memory, this
reduces the available heap for application requests and usually results in more
frequent garbage collection cycles by the application server JVM.
Storing session states in a persistent database or using memory-to-memory
replication provides a degree of fault tolerance to the system. If an application
server crashes or stops, any session state that it may have been working on
would normally still be available either in the back-end database or in another still
running application server’s memory, so that other application servers can take
over and continue processing subsequent client requests associated with that
session.
More information can be found in 6.8, “HTTP session management” on
page 279.
Note: DRS, and thus the session replication mechanism, has changed in
WebSphere V6. Configuration has been greatly simplified. The default
configuration setting is now to have a single replica. You can also replicate to
the entire domain (which corresponds to the peer-to-peer configuration in
WebSphere V5.x) or specify the number of replicas.
Performance measurements showed that the overhead associated with
replicating the session to every other cluster member is more significant than
contention introduced by using the database session repository (the more
application servers in the cluster, the more overhead for memory-to-memory
replication). Therefore, it is not recommended that you use the Entire Domain
setting for larger replication domains.
For larger session sizes, the overhead of session replication increases.
Database replication has a lower overall throughput than memory-to-memory,
due to database I/O limitations (the database becomes the bottleneck).
However, while database replication with large sessions performs slower, it is
doing so with less CPU power than memory-to-memory replication. The
unused processor power can be used by other tasks on the system.
Important: Neither mechanism provides a 100% guarantee that a session
state will be preserved in case of a server crash. If a server happens to crash
while it is literally in the middle of modifying the state of a session, some
updates may be lost and subsequent processing using that session may be
affected. Also, in the case of memory replication, if the node crashes during a
replication, or in between the time interval that has been set for replicating
session information, then some session information may be lost.
However, such a situation represents only a very small window of vulnerability
and a very small percentage of all occurrences throughout the life of a system
in production.
1.4.2 EJB sessions or transactions
In the case of an EJB client interacting with one or more EJBs, the management
of state information associated with a series of client requests is governed by the
EJB specification and implemented by the WebSphere EJS container, and
depends on the types of EJBs that are the targets of these requests.
Stateless session bean
By definition, when interacting with a stateless session bean, there is no
client-visible state associated with the bean. Every client request directed to a
stateless session bean is independent of any previous request that was directed
to the same bean. The container maintains a pool of instances of stateless
session beans of each type, and will provide an arbitrary instance of the
appropriate bean type whenever a client request is received. It does not matter if
the same actual bean instance is used for consecutive requests, or even if two
consecutive requests are serviced by bean instances in the same application
server.
Stateful session bean
In contrast, a stateful session bean is used precisely to capture state information
that must be shared across multiple consecutive client requests that are part of
one logical sequence of operations. The client must take special care to ensure
that it is always accessing the same instance of the stateful session bean, by
obtaining and keeping an EJB object reference to that bean. The various
load-distribution techniques available in WebSphere make special provisions to
support this characteristic of stateful session beans.
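For illustration, the following EJB 2.x client sketch keeps such a reference. The Cart interfaces and the JNDI name are hypothetical, but the lookup, narrow, and create pattern is the standard one:

import java.rmi.RemoteException;
import javax.ejb.CreateException;
import javax.ejb.EJBHome;
import javax.ejb.EJBObject;
import javax.naming.InitialContext;
import javax.rmi.PortableRemoteObject;

// Hypothetical stateful session bean interfaces.
interface Cart extends EJBObject {
    void addItem(String itemId) throws RemoteException;
}

interface CartHome extends EJBHome {
    Cart create() throws CreateException, RemoteException;
}

// The client creates one bean instance and keeps the EJB object
// reference so that every call reaches that same stateful instance.
public class CartClient {
    public static void main(String[] args) throws Exception {
        Object ref = new InitialContext().lookup("ejb/CartHome"); // assumed JNDI name
        CartHome home = (CartHome) PortableRemoteObject.narrow(ref, CartHome.class);

        Cart cart = home.create();   // state is created on one server
        cart.addItem("SG24-6392");   // subsequent calls use the same
        cart.addItem("SG24-6446");   // instance via the kept reference
        cart.remove();               // release the instance when done
    }
}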
A new feature in WebSphere V6 is the failover support for stateful session beans;
the state information is now replicated to other application servers in the cluster
using the Data Replication Service (DRS). If an application server fails, a new
instance of the bean is created on a different server, the state information is
recovered, requests are directed to the recovered instance and processing
continues.
Entity bean
Finally, we must consider the case of an entity bean. Most external clients access
WebSphere services through session beans, but it is possible for an external
client to access an entity bean directly. Furthermore, a session bean inside
WebSphere is itself often acting as a client to one or more entity beans also
inside WebSphere; if load distribution features are used between that session
bean and its target entity bean, then the same questions arise as with plain
external clients.
Strictly speaking, the information contained in an entity bean is not usually
associated with a “session” or with the handling of one client request or series of
client requests. However, it is common for one client to make a succession of
requests targeted at the same entity bean instance. Unlike all the previous
cases, it is possible for multiple independent clients to access the same entity
bean instance more or less concurrently. Therefore, it is important that the state
contained in that entity bean be kept consistent across the multiple client
requests.
For entity beans, the notion of a session is more or less replaced by the notion of
transaction. For the duration of one client transaction in which it participates, the
entity bean is instantiated in one container (normally the container where the first
operation within that transaction was initiated). All subsequent accesses to that
same bean, within that same transaction, must be performed against that same
instance in that same container.
In between transactions, the handling of the entity bean is specified by the EJB
specification, in the form of a number of caching options:
With option A caching, WebSphere Application Server assumes that the
entity bean is used within a single container. Clients of that bean must direct
their requests to the bean instance within that container. The entity bean has
exclusive access to the underlying database, which means that the bean
cannot be clustered or participate in workload management if option A
caching is used.
With option B caching, the bean instance remains active (so it is not
guaranteed to be made passive at the end of each transaction), but it is
always reloaded from the database at the start of each transaction. A client
can attempt to access the bean and start a new transaction on any container
that has been configured to host that bean.
With option C caching (which is the default), the entity bean is always
reloaded from the database at the start of each transaction and made passive
at the end of each transaction. A client can attempt to access the bean and
start a new transaction on any container that has been configured to host that
bean. This is effectively similar to the session clustering facility described for
HTTP sessions: the shared state is maintained in a shared database, and can
be accessed from any server when required.
Message-driven beans
The message-driven bean (MDB) was introduced with WebSphere V5. Support
for MDBs is a requirement of a J2EE 1.3 compliant application server. In
WebSphere V4.0, the Enterprise Edition offered a similar functionality called
message beans that leveraged stateless session EJBs and a message listener
service. That container, however, did not implement the EJB 2.0 specification.
The MDB is a stateless component that is invoked by a J2EE container when a
JMS message arrives at a particular JMS destination (either a queue or topic).
Loosely, the MDB is triggered by the arrival of a message.
Messages are normally anonymous. If some degree of security is desired, the
listener will assume the credentials of the application server process during the
invocation of the MDB.
MDBs handle messages from a JMS provider within the scope of a transaction. If
transaction handling is specified for a JMS destination, the listener will start a
global transaction before reading incoming messages from that destination. Java
Transaction API (JTA) transaction control for commit or rollback is invoked when
the MDB processing has finished.
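As a concrete illustration, the following is a minimal sketch of an EJB 2.0-style message-driven bean; the class is hypothetical, and the binding of the bean to its JMS destination (done through the deployment descriptor and the listener configuration) is omitted.

import javax.ejb.MessageDrivenBean;
import javax.ejb.MessageDrivenContext;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;

// Minimal EJB 2.0 message-driven bean: the container's listener invokes
// onMessage() when a message arrives at the configured destination.
public class OrderMDB implements MessageDrivenBean, MessageListener {
    private MessageDrivenContext ctx;

    public void setMessageDrivenContext(MessageDrivenContext ctx) { this.ctx = ctx; }
    public void ejbCreate() {}
    public void ejbRemove() {}

    public void onMessage(Message msg) {
        try {
            if (msg instanceof TextMessage) {
                String body = ((TextMessage) msg).getText();
                // Process the message; with container-managed transactions,
                // the listener has already started a global transaction.
                System.out.println("Received order: " + body);
            }
        } catch (JMSException e) {
            // Marking the transaction for rollback causes the message to be
            // redelivered according to the destination's settings.
            ctx.setRollbackOnly();
        }
    }
}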
1.4.3 Server affinity
The discussion above implies that a load-distribution facility, when it chooses a server to which to direct a request, is not entirely free to select any available server:
In the case of stateful session beans, or entity beans within the context of a transaction, there is only one valid server: WebSphere WLM always directs a client's access to a stateful session bean to the single server instance containing the bean. If a request nevertheless reaches the wrong server (for example, because of a configuration error), it either fails, or that server is forced to forward the request to the correct server, at great performance cost.
In the case of clustered HTTP sessions or entity beans between transactions,
the underlying shared database ensures that any server can correctly
process each request. However, accesses to this underlying database may
be expensive, and it may be possible to improve performance by caching the
database data at the server level. In such a case, if multiple consecutive
requests are directed to the same server, they may find the required data still
in the cache, and thereby reduce the overhead of access to the underlying
database.
The characteristics of each load-distribution facility that take these constraints into account are generally referred to as server affinity: in effect, the load distribution facility recognizes that multiple servers may be acceptable targets for a given request, but it also recognizes that each request may have a particular affinity for a particular server, where it may be handled better or faster.
We will encounter this notion of server affinity throughout the discussion of the
various load-distribution facilities in Chapter 6, “Plug-in workload management
and failover” on page 227 and Chapter 7, “EJB workload management” on
page 341. In particular, we will encounter the notion of session affinity, where the
load distribution facility recognizes the existence of a session and attempts to
direct all requests within that session to the same server, and we will also discuss
the notion of transaction affinity, in which the load distribution facility recognizes
the existence of a transaction, and behaves similarly.
Finally, we will also see that a particular server affinity mechanism can be weak
or strong. In a weak affinity mechanism, the system attempts to enforce the
desired affinity for the majority of requests most of the time, but may not always
be able to provide a total guarantee that this affinity will be respected. In a strong
affinity mechanism, the system guarantees that affinity will always be strictly
respected, and generates an error when it cannot.
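The following simplified sketch illustrates the weak-affinity idea in isolation. It is not WebSphere's routing code; the server list and session-to-server map are stand-ins for the load distribution facility's cluster view and affinity bookkeeping.

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Illustrative only: a simplified weak session-affinity selector.
public class AffinityRouter {
    private final List servers;                 // candidate cluster member names (String)
    private final Map affinity = new HashMap(); // session ID -> preferred server
    private final Random random = new Random();

    public AffinityRouter(List servers) {
        this.servers = servers;
    }

    public String route(String sessionId) {
        String preferred = (String) affinity.get(sessionId);
        // Weak affinity: honor the preference when that server is still
        // available; otherwise any member can serve the request, because the
        // shared state lives in the session database.
        if (preferred != null && servers.contains(preferred)) {
            return preferred;
        }
        String chosen = (String) servers.get(random.nextInt(servers.size()));
        affinity.put(sessionId, chosen);   // remember for subsequent requests
        return chosen;
    }
}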
1.5 Performance improvements over previous versions
WebSphere Application Server V6 contains a significant set of functional and
performance improvements over the initial release of WebSphere Version 5.0
and the incremental releases that followed. Most of the improvements listed
below are relative to WebSphere V5.1.
Improved performance with Java V1.4.2
Java Virtual Machine (JVM) enhancements.
Various class-level enhancements for logging, BigDecimal, NumberFormat,
date formatting and XML parsing.
Reduced lock contention for improved ORB scalability.
Improved Web container performance/scalability
Channel Framework
The Channel Framework, together with NIO (the non-blocking Java I/O API introduced in Java 1.4), supports non-blocking I/O. This allows a large number of concurrently connected clients to be supported and improves client scalability by enabling the WebSphere Application Server containers to handle more client requests with fewer threads than were required previously (a generic sketch of the non-blocking model follows this list).
Code path improvements
The Web container is redesigned and reimplemented for significant
reductions in code path and memory utilization.
Caching enhancements
The Web container redesign also contains internal improvements for caching
various types of objects inside the Web container. These caching
improvements provide better performance and scalability.
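For readers unfamiliar with the non-blocking model, the following generic sketch shows how a single thread can multiplex many connections using java.nio. It illustrates only the underlying Java 1.4 mechanism that the Channel Framework builds on and is not WebSphere code; the port number is arbitrary.

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// One thread serves many connections: the selector reports which channels
// are ready, so no thread is dedicated to an idle connection.
public class NioSketch {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.socket().bind(new InetSocketAddress(8080));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        while (true) {
            selector.select();   // block until at least one channel is ready
            Iterator keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = (SelectionKey) keys.next();
                keys.remove();
                if (key.isAcceptable()) {
                    // Accept and register the client for read readiness.
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    // Read the request and write the response here (omitted).
                }
            }
        }
    }
}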
Improved HTTP session replication using a new high-performing transport
The Domain Replication Service is rebased on the HAManager framework, which uses a multicast/unicast-based transport (DCS) that is itself built on the Channel Framework.
See Chapter 6, “Plug-in workload management and failover” on page 227 and
Chapter 9, “WebSphere HAManager” on page 465 for more information about
this topic.
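For orientation, what gets replicated is the state that an application places into the HttpSession. A trivial sketch follows; the servlet and attribute name are hypothetical, and attribute values should be serializable so they can be copied between cluster members.

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;

// A trivial servlet illustrating the session state that the replication
// service copies between cluster members; illustration only.
public class CounterServlet extends HttpServlet {
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        HttpSession session = req.getSession(true);
        Integer count = (Integer) session.getAttribute("hitCount");
        count = (count == null) ? new Integer(1) : new Integer(count.intValue() + 1);
        // Whatever is stored with setAttribute is what must be replicated.
        session.setAttribute("hitCount", count);
        resp.getWriter().println("Hits in this session: " + count);
    }
}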
EJB improvements
Cached data verification (functional improvement)
The new cached data verification in the EJB container allows aggressive caching of data while maintaining consistency with the database: an optimistic versioning-field check is performed at transaction commit time (a generic sketch of this scheme follows at the end of this section). This increases transaction performance while preserving the data consistency required in high-transaction-rate environments.
All EJB applications can benefit from improved EJB performance due to:
Reduced code path lengths throughout the runtime.
Application profiling
The application profiling technology (previously only available in IBM WebSphere Application Server Enterprise) improves overall entity EJB performance and throughput by fine-tuning run-time behavior: EJB optimizations can be customized for multiple user access patterns without resorting to "worst case" choices (such as using Pessimistic Update on findByPrimaryKey regardless of whether the client needs the bean for reading or for an actual update).
Higher-performance Access Intent settings
An automatic analysis tool is provided that determines the best-performing access intent setting for each entity bean in the application, for each application scenario in which that entity bean may be accessed.
Stateful session bean replication
The new stateful session bean replication uses a new high-performance transport.
Improved persistence manager
The improved persistence manager allows for more caching options and
better database access strategies.
Applications with compatible design patterns can exploit the following EJB
improvements:
Keep-update-locks optimization for DB2 UDB V8.2
This enables applications to use a row-blocking protocol for data fetching and to avoid deadlocks by enforcing update locks on affected rows. This feature enhances performance for applications that use searched updates, such as EJB CMP entity applications.
Read-only beans
The new "Read Only Beans" optimization provides enhanced EJB caching
performance.
Improved read-ahead
Improved read-ahead functionality enables reading data for multiple CMP entity beans with a single SQL statement.
See Chapter 7, “EJB workload management” on page 341 for more information
about this topic.
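The following generic JDBC sketch illustrates the optimistic versioning idea behind cached data verification: an update succeeds only if the version read earlier is still current. This is not the EJB container's actual implementation, and the table and column names are hypothetical.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// The version column guards against lost updates: if another transaction
// changed the row since it was read (or cached), the WHERE clause matches
// nothing and the update count is zero.
public class OptimisticUpdate {
    public void updateBalance(Connection con, String id,
                              long newBalance, int versionRead) throws SQLException {
        PreparedStatement ps = con.prepareStatement(
            "UPDATE ACCOUNT SET BALANCE = ?, VERSION = VERSION + 1 " +
            "WHERE ID = ? AND VERSION = ?");
        try {
            ps.setLong(1, newBalance);
            ps.setString(2, id);
            ps.setInt(3, versionRead);
            if (ps.executeUpdate() == 0) {
                // The cached data is stale; refresh and retry, or roll back.
                throw new SQLException("Optimistic concurrency check failed");
            }
        } finally {
            ps.close();
        }
    }
}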
Improved Web Services performance
Web Services performance improvements can be seen in several areas:
Improved deserialization
Improvements in the V6.0 tooling generate higher-performing deserializers for
all JAX-RPC beans. Because of this, redeploying a V5.x application into V6.0
may significantly decrease the processing time for large messages.
Web Services caching
Additional capabilities are added for Web Services caching. The V6.0
serialization/deserialization runtime is enhanced to cache frequently used
serializers/deserializers, significantly decreasing the processing time for large
messages. Web Services caching also includes JAX-RPC client-side Web Services caching (first introduced in V5.1.1).
Other Web Services improvements come in the area of in-process performance enhancements.
Refer to the redbook WebSphere Version 6 Web Services Handbook
Development and Deployment, SG24-6461 for more information about Web
Services.
Default messaging provider
The embedded JMS server from V5.x has been replaced by a new in-process Java messaging engine. The improved default messaging provider for V6.0 is an all-Java implementation that runs within the WebSphere Application Server JVM process and uses a relational database for its persistent store. It provides improved function and performance over V5.1, particularly in nonpersistent messaging scenarios: nonpersistent point-to-point in-process messaging (EJB to MDB) is up to five times faster than in V5.1. See Chapters 10 and 11 of the redbook WebSphere Application Server V6 System Management and Configuration Handbook, SG24-6451 for more information.
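For context, a nonpersistent point-to-point send with the JMS 1.1 unified API looks like the following sketch; the JNDI names are assumptions that depend on how the connection factory and queue are configured.

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.DeliveryMode;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.naming.InitialContext;

// A minimal JMS 1.1 sender using nonpersistent delivery, the scenario the
// new provider speeds up most.
public class NonPersistentSender {
    public static void main(String[] args) throws Exception {
        InitialContext ctx = new InitialContext();
        ConnectionFactory cf = (ConnectionFactory) ctx.lookup("jms/MyCF");
        Queue queue = (Queue) ctx.lookup("jms/MyQueue");

        Connection con = cf.createConnection();
        try {
            Session session = con.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(queue);
            // Nonpersistent messages skip the persistent store, trading
            // delivery guarantees for throughput.
            producer.setDeliveryMode(DeliveryMode.NON_PERSISTENT);
            producer.send(session.createTextMessage("hello"));
        } finally {
            con.close();
        }
    }
}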
Improved Dynamic Fragment Caching
DMap caching (or DistributedMap caching), which was previously only available
in IBM WebSphere Application Server Enterprise, performs better than
command caching because of how the caching mechanisms are implemented.
See Chapter 10, “Dynamic caching” on page 501 for more information.
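A minimal sketch of DistributedMap usage follows. The services/cache/distributedmap JNDI name is the commonly documented default for the base cache instance, but verify it against your configuration; the database loader is hypothetical.

import javax.naming.InitialContext;
import com.ibm.websphere.cache.DistributedMap;

// Cache-aside lookup against the DistributedMap: return the cached object
// if present, otherwise load it and cache it for subsequent requests.
public class CatalogCache {
    public Object getProduct(String id) throws Exception {
        InitialContext ctx = new InitialContext();
        DistributedMap cache =
            (DistributedMap) ctx.lookup("services/cache/distributedmap");

        Object product = cache.get(id);
        if (product == null) {
            product = loadFromDatabase(id);   // hypothetical loader
            cache.put(id, product);
        }
        return product;
    }

    private Object loadFromDatabase(String id) {
        return "product-" + id;   // placeholder for a real database read
    }
}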
Type 4 JDBC driver
The Universal JDBC Driver performs better than the legacy/CLI JDBC driver, due
mainly to significant code path reductions, a reduced dependency on JNI, and
optimizations not available to the legacy/CLI JDBC driver architecture.
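As a standalone sketch, a type 4 (pure Java) connection with the DB2 Universal JDBC Driver looks like this; host, port, database, and credentials are placeholders, and within WebSphere you would normally obtain connections from a configured datasource instead of DriverManager.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Type 4 mode connects directly to the database over the network in pure
// Java, without the JNI layer the legacy/CLI driver depends on.
public class Type4ConnectSketch {
    public static void main(String[] args) throws Exception {
        Class.forName("com.ibm.db2.jcc.DB2Driver");
        Connection con = DriverManager.getConnection(
            "jdbc:db2://dbhost:50000/TRADEDB", "user", "password");
        try {
            Statement st = con.createStatement();
            ResultSet rs = st.executeQuery("SELECT 1 FROM SYSIBM.SYSDUMMY1");
            while (rs.next()) {
                System.out.println(rs.getInt(1));
            }
        } finally {
            con.close();
        }
    }
}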
Tip: The IBM WebSphere Performance team publishes a performance report
for each release of WebSphere Application Server. This performance report
provides performance information using the Trade benchmark, information
about the new default messaging provider, memory considerations, and much
more. To obtain a copy of the WebSphere performance report, please contact your IBM sales or technical support representative.
1.6 The structure of this redbook
This redbook is divided into six parts containing the following chapters and
appendixes:
Part 1: Getting started
Chapter 2, “Infrastructure planning and design” on page 39 introduces scalability
and scaling techniques, and helps you to evaluate the components of your
e-business infrastructure for their scalability. Basic considerations regarding
infrastructure deployment planning, sizing, benchmarking, and performance
tuning are also covered in this chapter.
Chapter 3, “Introduction to topologies” on page 71 discusses topologies
supported by WebSphere Application Server V6 and the variety of factors that
come into play when considering the appropriate topology choice.
Part 2: Distributing the workload
Chapter 4, “Introduction to WebSphere Edge Components” on page 99 gives an
overview of the capabilities and options of WebSphere Edge Components.
Chapter 5, “Using IBM WebSphere Edge Components” on page 127 looks at Web server load balancing with the WebSphere Edge Components of IBM WebSphere Application Server Network Deployment V6, describing the most important configurations and their use. An introduction to the Caching Proxy is also included in this chapter.
Chapter 6, “Plug-in workload management and failover” on page 227 covers Web
container workload management. It discusses components, configuration
settings, and resulting behaviors. It also covers session management in a
workload-managed environment.
Chapter 7, “EJB workload management” on page 341 examines how EJB
requests can be distributed across multiple EJB containers.
Part 3: Implementing the solution
Chapter 8, “Implementing the sample topology” on page 387 provides step-by-step instructions for implementing a sample multi-machine environment, including a Caching Proxy and Load Balancer. This environment is used throughout this book to illustrate most of IBM WebSphere Application Server Network Deployment V6’s workload management and scalability features.
Part 4: High availability and caching
Chapter 9, “WebSphere HAManager” on page 465 introduces the new
HAManager feature of WebSphere Application Server V6 and shows some basic
usage and configuration examples.
Chapter 10, “Dynamic caching” on page 501 describes the use of caching
technologies in three-tier and four-tier application environments involving
browser-based clients and applications using WebSphere Application Server.
Part 5: Messaging
Chapter 11, “Using asynchronous messaging for scalability and performance” on page 621 gives an introduction to the JMS 1.1 API.
Chapter 12, “Using and optimizing the default messaging provider” on page 643
introduces the new default messaging provider that is part of WebSphere
Application Server V6. It also explains configuration options in a clustered
application server environment. This chapter extends Chapters 10 and 11 of the
redbook WebSphere Application Server V6 System Management and
Configuration Handbook, SG24-6451 and should thus be read in conjunction
with these chapters.
Chapter 13, “Understanding and optimizing the use of WebSphere MQ” on
page 697 describes how the various components for WebSphere MQ interact,
takes you through manually configuring the components for the Trade 6
application, and demonstrates running it with WebSphere MQ and WebSphere Business Integration Event Broker.
Part 6: Performance monitoring, tuning and coding practices
Chapter 14, “Server-side performance and analysis tools” on page 769 explains
the tools that can be run on the application server side to analyze and monitor
the performance of the environment. The topics covered include the Performance
Monitoring Infrastructure, Request Metrics, Tivoli Performance Viewer, and the
Performance Advisors.
Chapter 15, “Development-side performance and analysis tools” on page 839
offers an introduction to the tools that come with IBM Rational Application
Developer V6.0, such as the profiling tools or the separately downloadable Page
Detailer.
Chapter 16, “Application development: best practices for application design,
performance and scalability” on page 895 gives suggestions for developing
WebSphere Application Server based applications.
Chapter 17, “Performance tuning” on page 939 discusses tuning an existing environment for performance and provides a guide for testing the application servers and other components of the total serving environment that should be examined and potentially tuned to increase the performance of the application.
Appendixes
Appendix A, “Sample URL rewrite servlet” on page 1033 explains how to set up
an example to test session management using URL rewrites.
Appendix B, “Additional material” on page 1037 gives directions for downloading
the Web material associated with this redbook.
Chapter 2. Infrastructure planning and design
Are you wondering how to plan and design an infrastructure deployment
based on WebSphere middleware? This chapter covers the WebSphere-specific
components that you need to be familiar with to run a successful WebSphere
infrastructure project.
This chapter is organized in the following sections:
Infrastructure deployment planning
Design for scalability
Sizing
Benchmarking
Performance tuning
2.1 Infrastructure deployment planning
This section gives a general overview of the phases you go through during a project, how to gather requirements, and how to apply them to a WebSphere project.
Typically, a new project starts with only a concept. Very little is known about
specific implementation details, especially as they relate to the infrastructure.
Hopefully, your development team and infrastructure team work closely together
to help bring some scope to the needs of the overall application environment.
In some cases, the involvement of an IBM Design Center for e-business on
demand™ might be useful. For more information about this service, please see
2.1.1, “IBM Design Centers for e-business on demand” on page 41.
Bringing together a large team of people can help hone the requirements. If unfocused, however, a large team can be prone to wandering aimlessly and can create more confusion than it resolves.
For this reason, pay close attention to the size of the requirements team, and
keep the meetings focused. Provide template documents to be completed by the
developers, the application business owners, and the user experience team. Try
to gather information that falls into the following categories:
Functional requirements, which are usually determined by the business use of the application.
Non-functional requirements, which start to describe the properties of the
underlying architecture and infrastructure like reliability, availability, or
security.
Capacity requirements, which include traffic estimates, traffic patterns, and
expected audience size.
Requirement gathering is an iterative process. There is no way, especially in the
case of Web-based applications, to have absolutes for every item. The best you
can do is create an environment that serves your best estimates, and then
monitor closely to adjust as necessary after launch. Make sure that all your plans
are flexible enough to deal with future changes in requirements and always keep
in mind that they may have an impact on every other part of your project.
With this list of requirements, you can start to create the first drafts of your
designs. You should target developing at least the following designs:
Application design
To create your application design, you use your functional and non-functional
requirements to create guidelines for your application developers about how
your application is built.
This redbook does not attempt to cover the specifics of a software
development cycle. There are multiple methodologies for application design,
and volumes dedicated to best practices. However, Chapter 16, “Application
development: best practices for application design, performance and
scalability” on page 895 will give you a quick start to this topic.
Implementation design
This design defines the target infrastructure on which your application will be deployed.
The final version of this implementation design will contain every detail about
hardware, processors, and software installed on the different components, but
you do not start out with all these details in the beginning. Initially, your
implementation design will simply list component requirements such as a
database, a set of application servers, a set of Web servers, and whatever
other components are defined in the requirements phase.
This design might need to be extended during your project, whenever a
requirement for change occurs, or when you get new sizing information. Too
often, however, the reality is that a project may require new hardware, and
therefore be constrained by capital acquisition requirements. Try not to go back too often for additional resources once the project has been accepted.
Information that helps you to create this design is found in 2.2, “Design for
scalability” on page 42.
With the two draft designs in hand, you can now begin the process of formulating
counts of servers, network requirements, and the other infrastructure related
items. Basic information about sizing can be found in 2.3, “Sizing” on page 58.
In some cases, it might be appropriate to perform benchmark tests. There are
many ways to perform benchmarking tests, and some of these methods are
described in 2.4, “Benchmarking” on page 59.
The last step in every deployment is to tune your system and measure whether it can handle the projected load specified in your non-functional requirements. Section
2.5, “Performance tuning” on page 62 covers in more detail how to plan for load
tests.
2.1.1 IBM Design Centers for e-business on demand
The IBM Design Centers for e-business on demand are state-of-the-art facilities
where certified IT architects work with customers from around the world to
design, architect, and prototype advanced IT infrastructure solutions to meet the
challenges of on demand computing. Customers are nominated by their IBM
representative based upon their current e-business plans and are usually
designing a first-of-a-kind e-business infrastructure to solve a unique business
problem. After a technical and business review, qualified customers come to a
Design Center for a Design Workshop which typically lasts three to five days. If
appropriate, a proof-of-concept session ranging from two to eight weeks may
follow.
The centers are located in:
Poughkeepsie, USA
Montpellier, France
Makuhari, Japan
For more information about this service, please go to
http://www-1.ibm.com/servers/eserver/design_center/
2.2 Design for scalability
Understanding the scalability of the components in your e-business infrastructure
and applying appropriate scaling techniques can greatly improve availability and
performance. Scaling techniques are especially useful in multi-tier architectures,
when you want to evaluate components associated with IP load balancers such
as dispatchers or edge servers, Web presentation servers, Web application
servers, data servers and transaction servers.
You can use the following high-level steps to classify your Web site and identify
scaling techniques that are applicable to your environment:
1. Understanding the application environment
2. Categorizing your workload
3. Determining the most affected components
4. Selecting the scaling techniques to apply
5. Applying the technique(s)
6. Re-evaluating
This systematic approach needs to be adapted to your situation. We look at each
step in detail in the following sections.
2.2.1 Understanding the application environment
For existing environments, the first step is to identify all components and
understand how they relate to each other. The most important task is to
understand the requirements and flow of the existing application(s) and what can
or cannot be altered. The application is a significant contributor to the scalability
of any infrastructure, so a detailed understanding is crucial to scale effectively. At
a minimum, application analysis must include a breakdown of transaction types
and volumes as well as a graphic view of the components in each tier.
Figure 2-1 can aid in determining where to focus scalability planning and tuning
efforts. The figure shows the greatest latency for representative customers in
three workload patterns, and on which tier the effort should be concentrated.
Figure 2-1 How latency varies based on workload pattern and tier
For example, for online banking, most of the latency typically occurs in the
database server, whereas the application server typically experiences the
greatest latency for online shopping and trading sites. Planning to resolve those
latencies from an infrastructure solution becomes very different as the latency
shifts.
The way applications manage traffic between tiers significantly affects the
distribution of latencies between the tiers, which suggests that careful analysis of
application architectures is an important part of this step and could lead to
reduced resource utilization and faster response times. Collect metrics for each
tier, and make user-behavior predictions for each change implemented.
WebSphere offers various facilities for monitoring the performance of an
application, as discussed in Chapter 14, “Server-side performance and analysis
tools” on page 769 and Chapter 15, “Development-side performance and
analysis tools” on page 839.
[Figure 2-1 shows end-to-end response time broken into network latency and Web site latency, with the Web site portion spanning the Edge Server (ES), Web Server (WS), Application Server (AS), and Database Server/legacy systems (DB) tiers; the scope of performance management covers the Web site portion. The figure gives example percentages of latency per tier for banking, shopping, and trading workloads; for banking, roughly 5/11/23/61 percent of the latency falls on the ES/WS/AS/DB tiers, respectively.]
As requirements are analyzed for a new application, strive to build scaling techniques into the infrastructure. The implementation of a new application offers the opportunity to consider each component type, such as open interfaces and new devices, and the potential to achieve unprecedented transaction rates. Each scaling technique affects application design. Similarly, application design impacts the effectiveness of the scaling technique. To achieve proper scale, application design must consider potential architectural scaling effects. An iterative, incremental approach will need to be followed in the absence of known workload patterns.
2.2.2 Categorizing your workload
Knowing the workload pattern for a site determines where to focus scalability
efforts and which scaling techniques to apply. For example, a customer
self-service site such as an online bank needs to focus on transaction
performance and the scalability of databases that contain customer information
used across sessions. These considerations would not typically be significant for
a publish/subscribe site, where a user signs up for data to be sent to them,
usually via a mail message.
Workload patterns and Web site classifications
Web sites with similar workload patterns can be classified into site types.
Consider five distinct workload patterns and corresponding Web site
classifications:
Publish/subscribe (user to data)
Online shopping (user to online buying)
Customer self-service (user to business)
Online trading (user to business)
Business to business
Publish/subscribe (user to data)
Sample publish/subscribe sites include search engines, media sites such as
newspapers and magazines, as well as event sites such as sports
championships (for example the site for the Olympics).
Site content changes frequently, driving changes to page layouts. While search
traffic is low in volume, the number of unique items sought is high, resulting in the
largest number of page views of all site types. Security considerations are minor
compared to other site types. Data volatility is low. This site type processes the
fewest transactions and has little or no connection to legacy systems. Refer to
Table 2-1 on page 45 for details on these workload patterns.
Table 2-1 Publish/subscribe site workload pattern
Workload pattern: Publish/subscribe
Categories/examples: Search engines; media; events
Content: Dynamic change of the layout of a page, based on changes in content or need; many page authors, with page layout changing frequently; high-volume, non-user-specific access; fairly static information sources
Security: Low
Percent secure pages: Low
Cross-session info: No
Searches: Structured by category; totally dynamic; low volume
Unique items: High
Data volatility: Low
Volume of transactions: Low
Legacy integration/complexity: Low
Page views: High to very high
Online shopping (user to online buying)
Sample sites include typical retail sites where users buy books, clothes, or even cars. Site content can be relatively static, such as a parts catalog, or dynamic, where items are frequently added and deleted, such as for promotions and special discounts that come and go. Search traffic is heavier than on a publish/subscribe site, although the number of unique items sought is not as large. Data volatility is low. Transaction traffic is moderate to high, and almost always grows.
When users buy, security requirements become significant and include privacy, non-repudiation, integrity, authentication, and regulations. Shopping sites have more connections to legacy systems, such as fulfillment systems, than publish/subscribe sites, but generally fewer than the other site types. This is detailed in the following table.
Table 2-2 Online shopping site workload pattern
Workload pattern: Online shopping
Categories/examples: Exact inventory; inexact inventory
Content: Catalog either flat (parts catalog) or dynamic (items change frequently, near real time); few page authors, and page layout changes less frequently; user-specific information (user profiles with data mining)
Security: Privacy; non-repudiation; integrity; authentication; regulations
Percent secure pages: Medium
Cross-session info: High
Searches: Structured by category; totally dynamic; high volume
Unique items: Low to medium
Data volatility: Low
Volume of transactions: Moderate to high
Legacy integration/complexity: Medium
Page views: Moderate to high
Customer self-service (user to business)
Sample sites include those used for home banking, tracking packages, and making travel arrangements. Home banking customers typically review their balances, transfer funds, and pay bills. Data comes largely from legacy applications and often comes from multiple sources, thereby exposing data consistency issues.
Security considerations are significant for home banking and purchasing travel services, less so for other uses. Search traffic is low volume; transaction traffic is moderate, but growing rapidly. Refer to Table 2-3 for details on these workload patterns.
Table 2-3 Customer self-service site workload pattern
Workload pattern: Customer self-service
Categories/examples: Home banking; package tracking; travel arrangements
Content: Data is in legacy applications; multiple data sources, with a requirement for consistency
Security: Privacy; non-repudiation; integrity; authentication; regulations
Percent secure pages: Medium
Cross-session info: Yes
Searches: Structured by category; low volume
Unique items: Low
Data volatility: Low
Volume of transactions: Moderate and growing
Legacy integration/complexity: High
Page views: Moderate to low
Online trading (user to business)
Of all site types, trading sites have the most volatile content, the potential for the highest transaction volumes (with significant swings), and the most complex transactions, and they are extremely time sensitive. Auction sites are characterized by highly dynamic bidding against items with predictable lifetimes.
Trading sites are tightly connected to the legacy systems. Nearly all transactions interact with the back-end servers. Security considerations are high, equivalent to online shopping, with an even larger number of secure pages. Search traffic is low volume. Refer to Table 2-4 for details on these workload patterns.
Table 2-4 Online trading site workload pattern
Workload pattern: Online trading
Categories/examples: Online stock trading; auctions
Content: Extremely time sensitive; high volatility; multiple suppliers, multiple consumers; transactions are complex and interact with the back end
Security: Privacy; non-repudiation; integrity; authentication; regulations
Percent secure pages: High
Cross-session info: Yes
Searches: Structured by category; low volume
Unique items: Low to medium
Data volatility: High
Volume of transactions: High to very high (very large swings in volume)
Legacy integration/complexity: High
Page views: Moderate to high
Business to business
These sites include dynamic programmatic links between arm’s-length businesses (where a trading partner agreement might be appropriate). One business is able to discover another business with which it may want to initiate transactions.
In supply chain management, for example, data comes largely from legacy applications and often from multiple sources, thereby exposing data consistency issues. Security requirements are equivalent to online shopping. Transaction volume is moderate, but growing; transactions are typically complex, connecting multiple suppliers and distributors. Refer to Table 2-5 for details on these workload patterns.
Table 2-5 Business-to-business site workload pattern
Workload pattern: Business to business
Categories/examples: e-Procurement
Content: Data is in legacy applications; multiple data sources, with a requirement for consistency; transactions are complex
Security: Privacy; non-repudiation; integrity; authentication; regulations
Percent secure pages: Medium
Cross-session info: Yes
Searches: Structured by category; low to moderate volume
Unique items: Moderate
Data volatility: Moderate
Volume of transactions: Moderate to low
Legacy integration/complexity: High
Page views: Moderate
Workload characteristics
Your site type will become clear as it is evaluated using Table 2-6 for other characteristics related to transaction complexity, volume swings, data volatility, security, and so on.
If the application has further characteristics that could potentially affect its scalability, then add the extra characteristics to Table 2-6.
Table 2-6 Workload characteristics and Web site classifications
Workload characteristic               Publish/   Online    Customer      Online   Business     Your
                                      subscribe  shopping  self-service  trading  to business  workload
Volume of user-specific responses     Low        Low       Medium        High     Medium
Amount of cross-session information   Low        High      High          High     High
Volume of dynamic searches            Low        High      Low           Low      Medium
Transaction complexity                Low        Medium    High          High     High
Transaction volume swing              Low        Medium    Medium        High     High
Data volatility                       Low        Low       Low           High     Medium
Number of unique items                High       Medium    Low           Medium   Medium
Number of page views                  High       Medium    Low           Medium   Medium
Percent secure pages (privacy)        Low        Medium    Medium        High     High
Use of security (authentication,
integrity, non-repudiation)           Low        High      High          High     High
Other characteristics
Note: All site types are considered to have high volumes of dynamic transactions. This may or may not be a standard consideration when performing sizing and site classification.
2.2.3 Determining the most affected components
This step involves mapping the most important site characteristics to each component. Once again, from a scalability viewpoint, the key components of the infrastructure are the load balancers, the Web application servers, security services, transaction and data servers, and the network. Table 2-7 specifies the significance of each workload characteristic to each component. As seen in the table, the effect on each component is different for each workload characteristic.
Table 2-7 Load impacts on components
Workload characteristic            Web pres.  Web app.  Network  Security  Firewalls  Existing  Database
                                   server     server             servers              business  server
                                                                                      servers
High % user-specific responses     Low        High      Low      Low       Low        High      High
High % cross-session information   Med        High      Low      Low       Low        Low       Med
High volume of dynamic searches    High       High      Med      Low       Med        Med       High
High transaction complexity        Low        High      Low      Med       Low        High      High
High transaction volume swing      Med        High      Low      Low       Low        High      High
High data volatility               High       High      Low      Low       Low        Med       High
High # unique items                Low        Med       Low      Low       Low        High      High
High # page views                  High       Low       High     Low       High       Low       Low
High % secure pages (privacy)      High       Low       Low      High      Low        Low       Low
High security                      High       High      Low      High      Low        High      Low
2.2.4 Selecting the scaling techniques to apply
The best efforts in collecting the needed information are worthwhile in order to make the best possible scaling decisions. Only when the information gathering is as complete as it can be is it time to consider matching scaling techniques to components.
Manageability, security, and availability are critical factors in all design decisions. Techniques that provide scalability but compromise any of these critical factors cannot be used.
Here is a list of the eight scaling techniques:
Using a faster machine
Creating a cluster of machines
Using appliance servers
Segmenting the workload