Overcome the input/output pitfalls: centralized storage caching can accelerate operations by delivering real-time response
Communications News, Jan, 2008 by Gary Orenstein
While consolidated server deployments in a virtualized server environment can crank up utilization rates, this practice also allows large numbers of clients (real and virtual) to access data simultaneously on a single storage system. This can negatively impact corresponding input/output operations-per-second (IOPS) performance, especially for conventional storage systems based on slow mechanical disks. Real-time application performance in latency-sensitive enterprises can be compromised, thereby nullifying some of the advantages provided by virtualization.
One way to help avoid the "input/output (I/O) trap" of server virtualization technologies is available through centralized storage caching. This approach makes use of high-capacity, high-speed cache memory that is shared as a network resource, serving I/O-intensive requests directly from cache. This can accelerate I/O operations by delivering real-time response and higher IOPS.
Although Fibre Channel solutions are perceived to have a performance advantage, many customers value the flexibility and manageability of the Network File System (NFS) and have considered alternatives to NFS primarily because of performance concerns. Centralized storage caching can alleviate those concerns.
Application servers in the previrtualization era were sometimes I/O constrained, but the constraints were relatively easy to identify. For example, since each server typically hosted one primary application, measuring the I/O load on a per-server basis was easy. Also, the total number of servers and applications accessing a single storage system was limited due to logical and physical constraints.
While I/O imbalances still occurred, administrators had a relatively simple way of pinpointing loads and redistributing them to alternate storage subsystems, as necessary. I/O performance was predictable because the workload was non-virtual, segmented and fairly consistent. The end result was that administrators could create server and storage system combinations that operated efficiently.
With server virtualization deployments, many virtual machines and their corresponding applications share a single I/O connection, such as mounting a network-attached storage (NAS) file system or Fibre Channel logical unit number (FC LUN). In these cases, isolating the biggest I/O driver in the configuration becomes more difficult.
Furthermore, since consolidation remains a primary goal and benefit of virtualization, customers often end up deploying many virtual machines that connect to the same storage system. This combination often results in I/O contention that was absent in the previrtualization stage, and is more difficult to identify and resolve. The result is application volatility due to virtualized, shared and largely unpredictable workloads.
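As a rough illustration of what isolating the biggest I/O driver involves, the sketch below tallies request counts per virtual machine and ranks them. It assumes per-VM request records are available in the first place, which is exactly what a shared connection makes hard to obtain; the record format, the VM names and the function name top_io_drivers are hypothetical.

```python
from collections import Counter

def top_io_drivers(records, n=3):
    """Return the n virtual machines that issued the most I/O requests.

    records is an iterable of (vm_name, request_count) pairs. This format
    is an assumption made for illustration, not a real hypervisor API.
    """
    totals = Counter()
    for vm_name, request_count in records:
        totals[vm_name] += request_count
    # most_common sorts by total request count, highest first
    return totals.most_common(n)

# Hypothetical samples collected over several measurement intervals
sample = [
    ("vm-web", 120),
    ("vm-db", 900),
    ("vm-mail", 300),
    ("vm-db", 450),
    ("vm-web", 80),
]

print(top_io_drivers(sample, n=2))
```

With these made-up samples, the database virtual machine accounts for most of the requests, so it would be the first candidate to examine.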
ADDING LOADS TO THE SYSTEM
Within the storage system, contention occurs when multiple I/O streams compete for the same resource, namely mechanical disks. Each I/O stream, or, in this case, each additional virtual machine, can place excess I/O load on the storage system. The ease with which an IT manager can deploy a new virtual machine can therefore produce a potentially dangerous byproduct: I/O contention. Access patterns of the applications also determine overall I/O load; as more applications are added, the workload becomes more unpredictable, exacerbating the impact of contention.
The most conventional approach to improving I/O performance is to add more disk spindles. While this may provide a temporary increase in I/O capabilities, it masks rather than cures the underlying problem of more CPUs accessing slower mechanical disks. The same is true of selecting the drive type with the highest performance characteristics, which also carries the highest cost. In addition, modifying disk spindle count or drive type typically involves some type of data migration, a manual process that can be time consuming.
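A back-of-the-envelope calculation shows why adding spindles only keeps pace with demand rather than resolving it. The sketch below assumes aggregate demand is simply the number of virtual machines times a per-VM IOPS figure, and that every disk delivers a fixed IOPS figure; the values 50 and 150 are illustrative assumptions, not measurements from the article.

```python
import math

def spindles_needed(vm_count, iops_per_vm, iops_per_disk):
    """Disks required to serve the aggregate demand from disk alone.

    Assumes demand scales linearly with the number of virtual machines
    and that each disk sustains a fixed IOPS rate (a simplification).
    """
    total_iops = vm_count * iops_per_vm
    return math.ceil(total_iops / iops_per_disk)

# Illustrative figures only: 50 IOPS per VM, 150 IOPS per mechanical disk
for vms in (10, 40, 80):
    disks = spindles_needed(vms, iops_per_vm=50, iops_per_disk=150)
    print(f"{vms} virtual machines -> {disks} disks")
```

Under these assumptions the disk count grows in step with the virtual machine count, so each round of consolidation calls for another round of spindle purchases and data migration.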
Performance considerations remain paramount for users deploying virtualization. With virtualization still in its early stages at many companies, the performance concerns have yet to be fully characterized. Performance in virtual environments can be broadly classified by one or more of the following criteria:
* number of virtual machines per server;
* maximum number of virtual machines that can exist on a single physical server;
* total number of virtual machines per environment;
* maximum number of virtual machines per environment;
* virtual machine/application performance in steady state;
* the peak and sustainable application workload;
* virtual machine time to steady state (e.g., boot time);
* the time a virtual machine takes to reach peak performance; and
* I/O improvements through caching.
Centralized storage caching provides a means to boost I/O performance without having to deploy excessive numbers of disk drives, or manually migrate data from one storage device to another. A scalable caching appliance can be placed in the network to serve data from high-speed memory as opposed to disk. This can improve response time and IOPS without disrupting the existing storage infrastructure.
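The idea of serving repeated requests from memory instead of disk can be sketched as a small read-through cache. This is a minimal illustration, not a description of any particular caching appliance: the least-recently-used eviction policy, the class name ReadThroughCache and the stand-in function slow_disk_read are all assumptions introduced for the example.

```python
from collections import OrderedDict

class ReadThroughCache:
    """Serve block reads from memory, falling back to the backing store."""

    def __init__(self, backing_read, capacity):
        self._backing_read = backing_read   # function that reads from disk
        self._capacity = capacity           # maximum number of cached blocks
        self._blocks = OrderedDict()        # oldest entry first
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self._blocks:
            # Cache hit: mark the block as most recently used
            self._blocks.move_to_end(block_id)
            self.hits += 1
            return self._blocks[block_id]
        # Cache miss: fetch from the backing store and keep a copy
        self.misses += 1
        data = self._backing_read(block_id)
        self._blocks[block_id] = data
        if len(self._blocks) > self._capacity:
            # Evict the least recently used block
            self._blocks.popitem(last=False)
        return data

def slow_disk_read(block_id):
    """Hypothetical stand-in for a read from mechanical disk."""
    return f"data-{block_id}"

cache = ReadThroughCache(slow_disk_read, capacity=2)
for block in [1, 2, 1, 1, 3, 1]:
    cache.read(block)
print(f"hits={cache.hits} misses={cache.misses}")
```

Because the cache sits in front of the backing read function rather than replacing it, the storage behind it is untouched, which mirrors the non-disruptive placement described above. Only the requests that miss ever reach the disks.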
