HIGH IMPACT: A Guide to Imperial Technology's MegaCache-4000 Implementation
Author:
Mohan Bhyravabhotla
Contributors:
Jun Song, Oracle Corporation
Robert Michael, Oracle Corporation
John Jory, Imperial Technology, Inc.
Ramon A. Sandoval, Imperial Technology, Inc.
Abstract
The claim that Imperial Technology's MegaCache-4000 in-line caching system enhances I/O performance, and thereby the performance of an OLTP database application server, was investigated with a number of objective tests developed by Oracle Corporation.
A series of experiments was performed using a production-realistic load simulation environment modeled after a real OLTP database application. Performance data was analyzed both with and without the in-line caching system to identify significant variations. The performance advantages of the cache controller were examined, and best-practice recommendations were studied.
It was found that Imperial Technology's MegaCache-4000 in-line caching system provided a significant performance advantage for the OLTP application, and it offered the additional administrative flexibility of allowing the cache memory to be configured differently for different database files.
Audience
Database
administrators and system administrators running Oracle 7.3.3 on
Dynix/ptx 4.4.1 or higher.
Disclaimer
This paper is educational in nature; any concepts or techniques discussed within it should be thoroughly tested prior to production
implementation. The
opinions expressed herein are those of the authors and do not
necessarily coincide with those of Oracle Corporation or Imperial
Technology, Inc.
Executive Summary
Oracle
customers are very interested in database performance; therefore,
Oracle Corporation conducts tests of promising new technologies to
determine if they can provide advantages to its users.
As part of this effort, Oracle Corporation, together with
Imperial Technology, conducted a test of Imperial's MegaCache-4000
in-line caching system (a caching device that is installed between
the host computer and the disks or RAID units) to determine its
ability to improve system performance and scalability of high-end
Oracle database applications that place heavy I/O demand on the
peripheral storage devices.
The
results indicate significant value can be obtained when using the
MegaCache-4000. After
installing the product:
- The number of users could be increased from 100 to 900 without increasing response time.
- The queue of I/O requests that must be managed by the CPU was significantly reduced.
- The number of I/O requests continued to rise as the number of users increased to 900; without the MegaCache-4000, the I/O requests peaked with only 200 users.
For
the performance test the MegaCache-4000 was installed on a Sequent
SE60 with 16 Pentium® Processors and 1.5 gigabytes of memory,
running Dynix/ptx 4.4.1 Operating System and Oracle 7.3.3.
The MegaCache-4000 was configured with 6 UltraSCSI ports,
three ports attached to the host computer and three ports attached
to system disks that were mounted in three Sequent Pbay (Peripheral
bay) cabinets.
The MegaCache-4000 product is suitable for installation on multi-user systems that sustain high I/O request rates.
It can improve database performance whether the present disk
storage is configured as individual disks — commonly known as
"JBOD" or "Just a Bunch of Disks" — or in a
RAID (Redundant Array of Inexpensive Disks, or Redundant Array of
Independent Disks) configuration.
Introduction
The
MegaCache-4000 is a hardware caching device installed between the
host computer and the string of disks or RAID units that are
directly connected to the host computer's I/O bus.
Because it is capable of caching any or all of the disks or RAID units on the bus, or "string," it is known as a string-level cache. It is also a global cache because it does not allocate a portion of its cache memory to individual disks or RAID units; rather, it allows the entire cache memory to be used by whatever disks or RAID units are active.
After
installation the MegaCache-4000 is transparent to the host computer.
Performance improves as data, being written to or read from
the disks, is loaded into the MegaCache-4000’s memory.
It uses conventional DRAM (dynamic random-access memory) as
the cache buffer, with no mechanical latency in the access time,
resulting in extremely fast I/O performance.
Data is also prefetched, so information the host may require is already resident in the cache, further enhancing
performance. A feature
of this product is that a portion of the memory can be designated as
solid-state disk to further increase performance if certain files
are known to be “hot files.”
This feature of the MegaCache-4000 was not examined as part
of these tests.
The
purpose of the tests conducted was to measure the performance
advantages and scalability of the caching device in a database
application, and to evaluate the appropriate configuration settings
to ensure the safety of critical data while making optimum use of
the caching product.
The
MegaCache-4000 is easy to install and has a number of features that
make it suitable for high-availability system
environments. These
features are described in the section entitled
“Upgradeability/Maintenance.”
To adequately test the effect of this product, two sets of tests were performed. The first set was conducted without any caching device installed, and the second set was conducted with the MegaCache-4000 installed
on the system. Measurements
were taken to show the effect on CPU idle time, CPU wait time, the
number of users accommodated, and the response time of the reference
user.
Experimental Design
Configuration A (Without the Caching Device)
This
configuration consisted of a Sequent SE-60 server with 16
Pentium® Processors, 1.5 gigabytes of memory and three UltraSCSI
controllers each connected to a Pbay (Peripheral bay) consisting of
ten 2-GByte drives. All
the drives in the Pbays were under the control of the Sequent Volume
Management. The
database under test was layered on 3-way striped raw volumes with a stripe width of 64KB, with each stripe column connected to a different controller for good performance. The server ran the Dynix/ptx 4.4.1 operating system and Oracle 7.3.3. The
configuration is shown in Figure 1.
Figure 1: Configuration A (Without the Caching Device)
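For illustration only, the following minimal sketch (not part of the original test harness; the function and constant names are invented for this example) shows how a logical byte offset maps onto a 3-way stripe with a 64KB stripe width, which is why consecutive chunks land on different controllers:

```python
STRIPE_WIDTH = 64 * 1024   # 64 KB stripe unit, as in the test configuration
STRIPE_COLUMNS = 3         # one stripe column per UltraSCSI controller / Pbay

def locate(byte_offset: int) -> tuple[int, int]:
    """Map a logical byte offset in the striped volume to
    (column index, byte offset within that column)."""
    stripe_unit = byte_offset // STRIPE_WIDTH        # which 64 KB chunk
    column = stripe_unit % STRIPE_COLUMNS            # round-robin across columns
    offset_in_column = (stripe_unit // STRIPE_COLUMNS) * STRIPE_WIDTH \
        + byte_offset % STRIPE_WIDTH
    return column, offset_in_column

# Consecutive 64 KB chunks land on different controllers,
# which is how the 3-way stripe spreads the I/O load.
for chunk in range(6):
    print(chunk, locate(chunk * STRIPE_WIDTH)[0])    # prints columns 0 1 2 0 1 2
```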
Configuration B (With the Caching Device)
This
configuration is similar to Configuration A except for the addition
of the MegaCache-4000 system provided by Imperial Technology
containing 2 GBytes of memory and six UltraSCSI ports.
Three ports of the MegaCache-4000 were connected to the host
computer UltraSCSI controllers and the other three ports were
connected to the Pbays. The MegaCache-4000 in-line caching system also contained
redundant power supplies, redundant batteries, dual AC inputs, plus
an internal disk and battery backup module to protect the cached
data. Configuration B
is shown in Figure 2.
Figure 2: Configuration B (With the Caching Device)
MegaCache-4000 Product Description
The
MegaCache-4000 (referred to throughout this document as a caching
device) provides the ability to function as both a solid-state disk
and as a caching device to strings of conventional disks or RAID
units. It can be
configured with varying amounts of storage from 268 megabytes to
12.88 gigabytes and can be multiported with 2 to 6 ports per
chassis. According to the product specification, the access time to fetch cached data is 0.1 milliseconds, a hundred times faster than that of a conventional rotating disk. Data stored in a solid-state disk partition can be accessed twice as fast, in 0.05 milliseconds.
With small I/O transfers, 512 bytes to 16 Kbytes, the access time due to mechanical latencies is 10 times longer than the time taken to actually transmit the data. This technology removes that latency, allowing the number of possible I/O requests per second to increase from 85 to a few thousand.
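A rough back-of-the-envelope calculation makes the magnitude of that change concrete; the service times used below, other than the 0.1-millisecond cache access quoted in the product specification, are illustrative assumptions:

```python
# Rough upper bound on the request rate a single device can sustain:
# IOPS is approximately 1 / (average service time per request).
def max_iops(service_time_ms: float) -> float:
    return 1000.0 / service_time_ms

# Illustrative figure: roughly 85 requests/sec implies about 12 ms per
# small random request on a rotating disk (seek + rotation dominate).
print(max_iops(12.0))   # ~83 requests/sec

# A request serviced from the DRAM cache at ~0.1 ms access time (plus a
# small transfer overhead, assumed to be ~0.2 ms here) supports thousands.
print(max_iops(0.3))    # ~3,300 requests/sec
```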
The
MegaCache-4000 has three caching modes: Write Back Caching (Full
Cache), Write Through Caching, and Bypass.
When
in the Write Back Caching (Full Cache) mode, the MegaCache-4000
responds with a "command complete" status to the host as
soon as all data is received in cache. WRITE operations are completed to the system disks or RAID
units as a background task. The
greatest increase in system performance is achieved when both READ
and WRITE operations are cached.
When
in the Write Through Caching mode, both READ and WRITE operations
are cached, and the MegaCache-4000 does not respond with a
"command complete" status to the host until the data is
actually written to the target disk.
In
Bypass mode no caching takes place and all READ and WRITE operations
are passed through to the target disks.
The
user may independently choose any of the three caching modes for
each of the target disk strings that are cached by the
MegaCache-4000.
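The difference between the modes can be illustrated with a small conceptual sketch. This is not the device's firmware or management interface; the class and method names are invented for the example:

```python
from enum import Enum

class CacheMode(Enum):
    WRITE_BACK = "write-back (full cache)"
    WRITE_THROUGH = "write-through"
    BYPASS = "bypass"

class StringCache:
    """Toy model of an in-line string-level cache sitting between the
    host and one string of disks."""
    def __init__(self, mode: CacheMode):
        self.mode = mode
        self.cache = {}          # block number -> data held in DRAM
        self.dirty = set()       # blocks written to cache but not yet to disk

    def write(self, block: int, data: bytes, disk: dict) -> str:
        if self.mode is CacheMode.BYPASS:
            disk[block] = data               # no caching at all
            return "command complete (after disk write)"
        self.cache[block] = data             # both caching modes keep the data
        if self.mode is CacheMode.WRITE_THROUGH:
            disk[block] = data               # acknowledge only after the disk has it
            return "command complete (after disk write)"
        # write-back: acknowledge immediately; destage to disk as a background task
        self.dirty.add(block)
        return "command complete (immediate)"

    def flush(self, disk: dict) -> None:
        """Background destaging of dirty blocks (write-back mode only)."""
        for block in sorted(self.dirty):
            disk[block] = self.cache[block]
        self.dirty.clear()
```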
Performance Advantage
To
determine the effect the MegaCache-4000 caching device has on a
system, a call tracking application was loaded on a test server with
a cache device installed. A
production-realistic load simulation environment was used (see
section entitled "Production-Realistic Load Simulation"
under "The Test Environment").
This test simulates the system performance of a call center
as the number of users is increased.
The test was designed to terminate upon reaching 1,000 users
or when a 40-second average customer response time was exceeded.
Without the caching device installed the maximum allowed
40-second response time was exceeded with just 100 users.
To provide additional comparison points, testing of the
system was continued until the number of users was increased to 350.
Testing could not be continued beyond this point: the time the CPU spent waiting for disk I/O to complete was so high that the system was effectively saturated (the disk was up to 82.5% busy), and testing of this configuration was therefore terminated.
With
the caching device installed, the number of users was increased
until the 40-second response time was exceeded.
This occurred with 900 users on the system.
A
reference user was established (this is explained in the load
simulation model under the section entitled "The Test
Environment") and data was collected on the system response
time to service this reference user.
In both system configurations this data was gathered as the
total number of users was increased in groups of 50 users at a time.
Note:
The following graphs were generated from iostat, sar,
simulated user scripts, and custom script reports run while
simulating the Oracle call tracking application. These reports gathered data at 2-minute intervals and provide information on CPU usage, disk activity, and database response time.
Figure 3: Reference User Response Time
Figure
3 shows the reference user response time with and without the
caching device installed. It plots the time, in seconds, that the reference user had to wait as the number of users was increased.
The X-axis displays the total number of simulated users
logged on and performing SQL operations on the database such as
Select, Create, and Update. The
Y-axis displays the response time (in seconds) of the reference user
to complete the task.
The
separation between the two curves at any point along the X-axis
shows the advantage provided by the caching device.
With the caching device added, the system could service 900 users while giving the reference user the same response time the system delivered without the caching device when servicing only 50 users.
The
caching device was configured to cache both READ and WRITE
operations. This
provides the optimum response time to the application users for
database block read/write requests.
As shown in Figure 3, the response time is quite linear all
the way up to 900 simulated users with the response time changing
only from 20 to 40 seconds as the number of simulated users was
increased from 350 to 900.
The response time curve
without the caching device is not only non-linear but to some extent
logarithmic. This is
because the database block read/write requests are performing disk
I/O operations instead of operating out of cache.
Figure 4: Disk Utilization
Figure
4 displays the percentage of time the operating system disk
subsystem was busy or had uncompleted I/O requests to the storage
subsystem. As the
number of users was increased, the demand for additional I/O
requests was easily supported by the system configured with the
caching device. The
disk utilization reached 42.2% with 900 simulated users.
Without the caching device, the disk subsystem was 82.5% busy
with 350 simulated users. With
the caching device, the disk utilization was 21.3% busy with 350
simulated users.
The large difference in the percentage of disk busy time arises because, without the caching device, most of the database read/write requests are serviced by the rotating disks and suffer the seek and rotational delays of those electromechanical devices. The requests queue up, causing the average service time and the disk busy percentage to increase rapidly as the number of simulated users is increased. Without
the caching device the system cannot be scaled beyond 350 simulated
users. With the caching
device, the response time to read/write requests is within
acceptable limits at 900 simulated users.
Figure 5: Average Number of Reads/Writes per Second
Figure
5 is a graph displaying the ability of the disk subsystem to support
additional I/O requests as the number of simulated users increases.
It can be seen that the disks, without the caching device,
can provide a maximum of approximately 85 I/O requests per second.
This peak I/O rate was reached with only 150 simulated users
on the system. With the
caching device configured, the number of completed I/O requests per
second increased as more simulated users were added.
Figure 6: Bandwidth of the Disk Subsystem
Figure
6 is a graph showing the effective bandwidth of the disk subsystem
with and without the caching device installed.
The block size used in this test was 2KB.
As depicted in the graph,
without the caching device the average number of blocks per second
transferred reached its peak with only 200 simulated users on the
system. With the
caching device added, the average number of blocks per second
transferred continued to increase as more users were added.
With 350 simulated users on the system, the number of blocks transferred per second was approximately 1,450.
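Since the test used 2KB blocks, that transfer rate corresponds to a modest effective bandwidth; a quick calculation (illustrative only):

```python
blocks_per_second = 1450          # approximate rate at 350 users with the cache
block_size_bytes = 2 * 1024       # 2 KB block size used in this test
print(blocks_per_second * block_size_bytes / 1e6)   # ~2.97 MB/s effective bandwidth
```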
Figure 7: Percentage Queue Reduction
Figure
7 is a graph displaying the percentage reduction in the I/O queue,
when the caching device was configured in the system.
Since managing the I/O requires CPU resources, any reduction
in queue length allows these resources to be available for other
tasks. Therefore, a
lower percentage is better.
CPU time can be divided
into four main categories: "%sys",
the percentage of time the CPU spends controlling the system;
"%usr", the percentage of time the CPU spends running the
user application; "%wio", the percentage of time the CPU
spends waiting for the disks to complete I/O requests, and
"%idle", the percentage of time the CPU is available to
perform additional tasks. The
following three graphs display comparison data before and after the
caching device was added.
Figure 8: CPU Utilization - %sys
Figure 9: CPU Utilization - %usr
Figure 10: CPU Utilization - %wio
Note:
In Figure 10, CPU Utilization - %wio, a low percentage wait
time is desirable.
Figure
10 is a graph showing the comparative consumption of CPU resources
waiting for the disk drives to complete the I/O request.
Without the caching device, most of the CPU processing power was not being used productively; the CPU spent much of its time waiting for I/O operations to complete. With the caching device installed, the cache memory acts as a buffer, making effective use of the CPU resources. The test results show that CPU utilization was 110% greater at the 350-user level when the caching device was not present.
The
following table summarizes the data taken from Figures 8, 9 and 10
for 350 simulated users (the maximum number tested without the
caching device installed).
| Utilization | Without Cache | With Cache |
| %sys        | 10.65%        | 10.36%     |
| %usr        | 12.76%        | 22.72%     |
| %wio        | 64.82%        | 8.55%      |
| Subtotal    | 88.23%        | 41.63%     |
| %idle       | 11.77%        | 58.37%     |
| Total       | 100.00%       | 100.00%    |

Table 1: CPU Utilization
Table
1 shows that without the use of a caching device the idle time is
only 11.77%, which is the percentage of time the CPU is available to
perform additional tasks. With the introduction of a caching device, the idle percentage increases to 58.37%.
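The per-category percentages above come from sar reports. As an illustration, a minimal sketch for extracting the four categories from `sar -u`-style text output is shown below; the column layout varies between operating systems, so the header names used here are assumptions:

```python
def parse_sar_cpu(report: str) -> list[dict]:
    """Parse sar -u style output into %usr/%sys/%wio/%idle samples.
    Assumes a header row that names the columns; the exact layout
    differs from one operating system to another."""
    samples = []
    columns = None
    for line in report.splitlines():
        fields = line.split()
        if not fields:
            continue
        if "%idle" in fields:                       # header row
            columns = fields
            continue
        if columns and len(fields) == len(columns):
            row = dict(zip(columns, fields))
            try:
                samples.append({k: float(row[k])
                                for k in ("%usr", "%sys", "%wio", "%idle")})
            except (KeyError, ValueError):
                pass                                # skip rows that do not parse
    return samples
```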
The
addition of a MegaCache-4000 in-line caching device can provide a
considerable performance boost to a system that has a high I/O
demand. This is true both from the perspective of the computer system's resources and from that of the individual user.
Recommendation/Conclusion
The
results of the testing conducted at Oracle Large Systems
Support-Belmont Center show Imperial Technology's MegaCache-4000
in-line caching system can provide a significant throughput
improvement. In applications where performance is limited by a high I/O rate, this technology is a valid, quick, and reliable solution.
Good
practices suggest that critical, non-recoverable elements of a
database, such as the redo log volumes, should be protected against
failures by immediately writing them to non-volatile storage and not
having this data cached. The
flexible configuration of the MegaCache-4000 allows the operator to
set write-through caching for selected files, while allowing other
files to get the full performance benefit of full read and write
caching.
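A minimal sketch of that policy, expressed as a simple mapping, is shown below. The file classes and mode names are illustrative only; the actual assignment is made per cached disk string through the MegaCache-4000's own configuration interface, which is not described in this paper:

```python
# Illustrative policy only: which caching mode to request for each class
# of database file, following the recommendation above.
CACHE_POLICY = {
    "redo_logs":     "write-through",  # critical, non-recoverable: ack only after disk write
    "control_files": "write-through",  # same reasoning as the redo logs
    "data_files":    "write-back",     # full read/write caching for best performance
    "temp_segments": "write-back",     # recoverable, so safe to cache aggressively
}

def mode_for(file_class: str) -> str:
    """Return the recommended caching mode, defaulting to write-through
    for anything not explicitly classified."""
    return CACHE_POLICY.get(file_class, "write-through")
```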
Upgradeability/Maintenance
The
MegaCache-4000 can be configured in capacities ranging from 268
megabytes to 12.88 gigabytes and with 2, 4 or 6 ports.
If a MegaCache-4000 system is installed with less than the
maximum capacity, it can be upgraded by simply adding storage
modules. In addition,
if a unit is installed with less than the maximum 6 ports,
additional controller boards can be added.
Each controller contains 2 ports.
The product is well designed to provide a high degree of fault tolerance and to prevent component failures from causing a system failure.
This is accomplished with extensive error detection and
correction (EDAC) in the memory and by redundancy elsewhere in the
product. A brief
description of these elements is listed below.
Error Detection and Correction (EDAC) Circuitry
The
MegaCache-4000 uses DRAM (dynamic random-access memory) devices for
data storage and a powerful proprietary Reed-Solomon Code for
correcting any rare errors that might occur.
With this design, stored data is very safe and reliable. The Reed-Solomon code is much more powerful than the Hamming code, the conventional memory error correction technique.
Hamming
codes, used in most memory designs, correct only a single-bit error
in a word group. The MegaCache-4000 uses a powerful proprietary Reed-Solomon error correction code that can detect and correct as many as 6 whole bytes (48 bits) within a 64-byte data group without affecting system
availability or performance. This is the most powerful error detection and correction code offered on any product today, and it is used throughout Imperial Technology's product lines.
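As a rough illustration of the general principle (not the MegaCache-4000's proprietary implementation), the third-party Python package `reedsolo`, assumed here purely for demonstration, shows how 2t parity symbols let a Reed-Solomon code correct up to t corrupted bytes in a codeword:

```python
from reedsolo import RSCodec

rsc = RSCodec(12)              # 12 parity bytes -> corrects up to 6 byte errors
data = bytes(range(64))        # a 64-byte data group, as in the paper
codeword = rsc.encode(data)

corrupted = bytearray(codeword)
for i in (3, 10, 17, 25, 40, 60):   # corrupt 6 whole bytes
    corrupted[i] ^= 0xFF

decoded = rsc.decode(bytes(corrupted))
# Depending on the library version, decode() returns either the message or a
# (message, full_codeword, errata_positions) tuple; the message comes first.
message = decoded[0] if isinstance(decoded, tuple) else decoded
assert bytes(message) == data        # all 6 byte errors corrected
```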
Power System
The
power system is completely redundant and contains its own internal
UPS (uninterruptible power supply) system.
There are two AC (alternating current) inputs allowing the
unit to be powered by two separate AC circuits.
If one input AC power source is removed, the MegaCache-4000
system is unaffected. The
+5VDC and +12VDC power supplies are also redundant.
In normal operation, the load is shared to reduce
temperatures and further enhance reliability. The failure of any one DC (direct current) power supply does
not affect any system operation.
Since the DRAM devices used to store data are inherently volatile, the built-in UPS system assures data integrity in the event of an AC input power failure. The UPS system employs redundant batteries that are both
separately charged and constantly tested by built-in patrol
diagnostics that operate in background mode.
Data Protection on Power Failure
If the system is turned off or if both AC inputs fail, data stored in the DRAM memory is preserved by the redundant NiCad battery system, which allows all data to be completely and safely written to the MegaCache-4000's conventional internal disk backup unit.
All the data written to the MegaCache-4000 but not yet
written to the system disks is copied to this internal disk using
battery power. Each
battery has its own separate charging circuit that is automatically
tested by the patrol diagnostic facility at all times.
When power is restored, the data is automatically returned to
the cache memory and copied out to the system disks.
The Test Environment
The
test environment was set up at Oracle Large Systems Support-Belmont
Center. An internal
call tracking system on Oracle 7.3.3 RDBMS was used as a test
application. The test
environment consisted of three major components: virtual load
simulation, database server, and network.
The virtual load simulation was driven from an HP-K400 server, and a Sequent SE60 was used as the database application server. Both were connected over a 10BASE5 Ethernet running at 10 Mbits/sec.
Production-Realistic Load Simulation
The
ability to generate a realistic load is the most important step to
obtaining meaningful performance data. It is not practical to have
hundreds of live users working at hundreds of
NCs (network computers)
to generate the load. It
is also not feasible to use a terminal emulation tool for user
activity, because of the requirement to provide a large quantity of
client hardware. Our
approach was to simulate the client at the network level.
A third-party tool capable of simulating the user load through the Oracle OCI layer was used to complete the task.
This tool has the ability to record the network traffic
between the client and the database application server for specific
transactions. When the
traffic is replayed multiple times simultaneously, from the
application server's point of view, it looks as though multiple
users are accessing the server.
Since the system under test is the database application
server, this virtual client serves our purpose.
The simulation model was built with a controlled load to the
server, while maintaining a realistic reproduction of transactions
actual users perform.
The successful production-realistic simulation was achieved by:
Defining the user profile: The user profile was defined to be similar to that of Oracle's on-line call tracking system, a production application within Oracle Corporation.
Creating
the model: This
consisted of two parts: transaction script files and a workload
description file (WDF). The production-realistic on-line transactions (SQL
statements) were turned into script files that generate the same
sequence of SQL statements that a real client would.
The preVue product has the ability to capture and interpret
the Oracle SQL*Net traffic between a single user and Oracle RDBMS
server. It then places
the results in a script, which can be played back against the
server, under control of the tool.
The workload description file (WDF) defines each user in a
user profile by specifying the mix of transactions each user will
perform.
One user in the simulation model was designated as the “reference user.” The reference user is defined to be more sensitive to anticipated key resource contention than a real user: it briefly but intensely accesses, in a read-only pattern, the same resources that real users do.
Executing
the test: This
was accomplished with the preVue interface that controls the
execution of the simulation tests.
While the test is running, the application server resources
are closely monitored at the RDBMS and host OS levels.
Analyzing
the results: The
performance data was collected by use of the ‘sar’ (system
activity report) utility and custom scripts at 2-minute intervals.
The data samples were imported into Excel spreadsheets and
averaged to 30-minute intervals.
Response times for the reference user were plotted against
the simulated load.
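A minimal sketch of that averaging step, assuming the 2-minute samples are already available as a list of numbers (the actual work was done in Excel spreadsheets):

```python
def average_in_windows(samples: list[float], samples_per_window: int = 15) -> list[float]:
    """Average 2-minute samples into 30-minute windows
    (15 two-minute samples per window)."""
    windows = []
    for start in range(0, len(samples), samples_per_window):
        window = samples[start:start + samples_per_window]
        windows.append(sum(window) / len(window))
    return windows

# Example: 90 two-minute %wio samples become six 30-minute averages.
# print(average_in_windows(wio_samples))
```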
Software Version
preVue 5.0 from Rational Software.
Hardware Configuration
HP-K400 with 4 CPUs, 1GB of memory, running HP-UX 10.20.
Application Database Server
The
production-realistic load simulation was conducted on a database
built from the export of a production call tracking application
database. The database
runs on Oracle RDBMS 7.3.3.
The
database was created on the raw volumes.
Each volume was striped across three spindles, and each
spindle was from a different Pbay connected to an independent SCSI
controller on the host computer for optimum performance.
The size of the database was 3 gigabytes.
The system selected had the required memory for the SGA and
enough processing power for our transaction needs.
Software Version
Oracle 7.3.3
Hardware Configuration
Sequent SE60 with 16x 66MHz Pentium® Processors, 1.5GB memory, and 3x SCSI controllers, each connected to one Pbay with 10x 2GB spindles, running the Dynix/ptx 4.4.1 Operating System.
Network
Both the load simulation server and the database application server were connected to the corporate network over a 10BASE5 network.
Acknowledgements
Several
members of the LSS team were instrumental in the success of the
MegaCache-4000 Implementation project.
Teri
Bommarito: Due to
her excellent editing skills, Teri played a key role in ensuring
that the important messages and structure of this document were of
the highest standard.
Basab
Maulik: Basab provided
valuable assistance in technical editing and presentation of data in
this paper.
Bill
Cowden: Bill provided guidance and commentary on the structure and
flow of the document along with presenting technical data.
Darryl
Presley: Darryl's
knowledge of Oracle technology helped verify the conclusions reached
in this paper.
More Information From Oracle
To receive more
information regarding the "HIGH IMPACT: A Guide to Imperial
Technology's MegaCache-4000 Implementation" study, please
contact:
| Large Systems Support | Tel: (650) 506-2952 |
| Oracle Corporation | Fax: (650) 506-7584 |
| 20 Davis Drive | Web Site: www.oracle.com/support/lss/ |
| Belmont, CA 94002 | |
Oracle Large
Systems Support (LSS) is the premium service of Oracle Support
Services devoted to mitigating the risk in implementing unproven
system configurations. The
LSS initiative is a partnership between the service divisions at
Oracle, enterprise systems platform vendors, and Oracle Business
Alliance Partners. The
LSS mission is to proactively ensure customers’ large enterprise
systems are reliable and supportable.
More Information From Imperial Technology
To receive more
information regarding Imperial Technology's MegaCache®-4000 in-line
caching system or any of the MegaRam® Solid-State Disk products,
please contact:
| Craig Harries | Tel: (310) 536-0018 or (800) 451-0666 |
| Imperial Technology, Inc. | Fax: (310) 536-0124 |
| 2305 Utah Avenue | Web Site: http://www.imperialtech.com |
| El Segundo, CA 90245 | |