Providing News & Information For Data Intensive Storage Solutions For The Enterprise
Features - Enterprise Data Insights:

BROADENING THE BANDWIDTH: STORAGE CLUSTERS IN GOV'T SCIENCES

Scientists at government agencies and academic laboratories address many of our nation's most pressing problems, including challenges in energy research, environmental science, global climate change, earthquake simulation, and intelligence and homeland security. Each of these applications requires the most powerful computing solutions available. Increasingly, the systems used to explore these issues are massively parallel Linux clusters. These systems can acquire, process and generate large amounts of data, and they place increasingly strenuous demands on existing storage architectures. Object-based storage is a new architecture that complements the highly parallel nature of the compute cluster. Designed to deliver the performance and scalability required by these applications, it is changing the shape of scientific high performance computing and expanding our ability to meet these growing challenges.

High energy physics experiments collect petabytes of data from particle colliders, giving us insight into the nature of matter and our universe. Global climate change simulations provide a view into the future, shedding light on the impact of our interaction with our environment. The Accelerated Strategic Computing Initiative (ASCI) ensures the performance, readiness and safety of our nation's nuclear stockpile through large-scale simulations -- eliminating the need for nuclear testing. And data-intensive surveillance systems form the core of our intelligence network.

All of these applications are recognized for their computational complexity. They are all equally voracious in their appetite for high performance storage. Rapid access to shared datasets, often multiple terabytes in size, is critical for ensuring optimal utilization of valuable compute cluster assets.
These datasets need to be made globally available to all processes executing on the compute cluster in order to simplify development and system management activities. Traditional networked storage systems cannot provide the performance needed to serve the aggressive shared access requirements of these applications.

Fortunately, a new networked storage architecture, called object-based storage, is emerging to fill the expanding gap -- promising high bandwidth parallel access to thousands of clients over standard TCP/IP infrastructures. It is a solution in which the storage system's scalability and price-performance can be matched to those of the cluster computer. Together, these approaches deliver commodity supercomputers able to keep pace with increasingly aggressive -- and important -- applications.

To better understand the need for this new approach to scalable storage, we first explore the manner in which many cluster computing applications have attempted to address the storage bottleneck. Most of these applications use what is commonly referred to as a scale-out or shared-nothing approach to parallel computing. In this model, applications employ a "divide-and-conquer" approach -- decomposing the problem to be solved into thousands of independently executed tasks. The most common decomposition approach exploits a problem's inherent data parallelism -- breaking the problem into pieces by identifying the data subsets, or partitions, that make up each individual task, then distributing each task and its corresponding data partitions to the compute nodes for processing.

For example, climate change simulations decompose the problem into thousands of adjacent spatial regions or "cells." Each of the cells executes a local environment model, updating its state based on local interactions with adjacent cells and additional local or global inputs. Simulations may run for millions of iterations (time steps), with results captured along the way.
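The decomposition described above can be illustrated with a minimal sketch. It assumes a one-dimensional ring of cells and a simple neighbor-averaging rule standing in for the "local environment model"; the function names (`partition`, `update_partition`, `step_decomposed`) and the update rule are hypothetical and chosen only for illustration. Each task receives its own partition plus one boundary value from each adjacent partition, and the decomposed step reproduces the result of a step over the whole domain.

```python
def step_serial(cells):
    """One time step over the whole domain: each cell averages itself and its two neighbors."""
    n = len(cells)
    return [(cells[(i - 1) % n] + cells[i] + cells[(i + 1) % n]) / 3.0
            for i in range(n)]


def partition(n_cells, n_tasks):
    """Split n_cells into n_tasks contiguous (start, stop) ranges of near-equal size."""
    base, extra = divmod(n_cells, n_tasks)
    bounds, start = [], 0
    for t in range(n_tasks):
        stop = start + base + (1 if t < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def update_partition(local, left_halo, right_halo):
    """Work done by one task: it sees only its partition and two boundary values."""
    padded = [left_halo] + local + [right_halo]
    return [(padded[i - 1] + padded[i] + padded[i + 1]) / 3.0
            for i in range(1, len(padded) - 1)]


def step_decomposed(cells, n_tasks):
    """One time step computed as independent per-partition tasks."""
    n = len(cells)
    result = []
    for start, stop in partition(n, n_tasks):
        left = cells[(start - 1) % n]   # boundary cell owned by the left neighbor
        right = cells[stop % n]         # boundary cell owned by the right neighbor
        result.extend(update_partition(cells[start:stop], left, right))
    return result


if __name__ == "__main__":
    state = [float((i * i) % 7) for i in range(10)]
    print(step_decomposed(state, 4) == step_serial(state))
```

In a real cluster each call to `update_partition` would run on a different node, and only the boundary values would cross the network between time steps.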
The result is a multi-terabyte "stack" of simulation results that can be further refined, analyzed, and visualized to assess the model's results. Although the individual processor contributions to the output may be modest, the aggregate I/O generated by large clusters -- often many thousands of nodes -- is appreciable: up to tens of gigabytes per second. Additionally, subsequent analysis and visualization of simulation results is also compute- and data-intensive, requiring multi-gigabyte-per-second bandwidth.

Cluster computing developers want to deploy a shared storage solution to house these large datasets, one that can be accessed by all nodes in the cluster. Such a solution greatly simplifies management of the compute jobs, as all data partitions and replicas can be made available to all nodes, and hence any of the tasks can be computed on any node. Additionally, the output of these jobs can then be used directly elsewhere -- in post-processing, visualization or even as the input to the next processing task in a computational pipeline.

Unfortunately, standard shared storage solutions provided by NFS file servers built from direct-attached storage (DAS) are only sufficient for small clusters. Larger clusters require more scalable storage solutions. Storage area networks (SANs) and optimized network-attached storage (NAS) architectures have been employed for modest-sized clusters; however, these architectures have severe limitations as clusters become larger. In particular, neither SAN nor NAS architectures support the aggressive concurrency and high per-client throughput requirements of these cluster computing applications. SANs were designed to provide a modest number of application servers with high performance, highly reliable access to a shared pool of storage devices -- e.g., for enterprise transactional databases.
SANs afford leverage over the storage provisioning process -- allowing disks to be moved among application servers to readily address changes in capacity requirements, but leading to application-server-based islands of data. NAS systems, on the other hand, were designed to afford widespread data sharing on heterogeneous platforms, with relatively modest per-user I/O requirements -- e.g., for user home directory storage. And while there are convergence product offerings that front SAN-based storage with NAS "heads," these products tend to be performance-limited and relatively expensive -- in terms of both capital and maintenance costs.

Because of these limitations, organizations are forced to adopt a process in which data from a shared storage system is staged (copied) to the compute nodes, processing is performed, and results are destaged from the nodes back to shared storage when done. In many applications, the staging and destaging time can be appreciable -- up to several hours for large clusters.

Fortunately for the growing community of cluster computer users, a new storage architecture is emerging. Object-based storage is the foundation for building massively parallel storage systems that leverage commodity processing, networking, and storage components to deliver unprecedented scalability and aggregate throughput in a cost-effective and manageable package. At the core of this architecture are storage objects, fundamental containers that house both application data and an extensible set of storage attributes. Traditional user and application files are decomposed into a set of storage objects and distributed across one or more "smart disks," called Object-based Storage Devices (OSDs). Each OSD includes local processing capabilities, local memory for data and attribute caching, and its own network connection.
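As a rough illustration of how a file might be decomposed into storage objects, the sketch below models an object as data plus an attribute dictionary and spreads fixed-size stripe units round-robin across a set of OSDs. The `StorageObject` class, the attribute names, the object-naming scheme and the round-robin layout are all assumptions made for this example; the article does not specify a particular layout.

```python
from dataclasses import dataclass, field


@dataclass
class StorageObject:
    """Hypothetical storage object: a data payload plus extensible attributes."""
    object_id: str
    data: bytes
    attributes: dict = field(default_factory=dict)


def decompose_file(name, payload, n_osds, stripe_unit):
    """Split payload into stripe units and deal them round-robin, one object per OSD."""
    buffers = [bytearray() for _ in range(n_osds)]
    for k, offset in enumerate(range(0, len(payload), stripe_unit)):
        buffers[k % n_osds] += payload[offset:offset + stripe_unit]
    return [
        StorageObject(
            object_id=f"{name}.{i}",
            data=bytes(buf),
            attributes={"file": name, "osd": i, "length": len(buf)},
        )
        for i, buf in enumerate(buffers)
    ]


def reassemble(objects, stripe_unit, total_length):
    """Read stripe units back in round-robin order to rebuild the original file."""
    out = bytearray()
    cursors = [0] * len(objects)
    k = 0
    while len(out) < total_length:
        i = k % len(objects)
        out += objects[i].data[cursors[i]:cursors[i] + stripe_unit]
        cursors[i] += stripe_unit
        k += 1
    return bytes(out)


if __name__ == "__main__":
    payload = bytes(range(256)) * 3 + b"xyz"
    objs = decompose_file("results.dat", payload, n_osds=4, stripe_unit=64)
    print([o.attributes["length"] for o in objs])
    print(reassemble(objs, 64, len(payload)) == payload)
```

Because consecutive stripe units live on different devices, a client reading a large range can pull from several OSDs at once, which is the source of the aggregate bandwidth the architecture aims for.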
OSDs form the core of a distributed storage architecture in which much of the traditional storage allocation activity can be offloaded from the file system layer -- removing a key performance bottleneck present in current storage systems. Object attributes include security information and usage statistics useful for enforcing credential-based access and Quality of Service policies, and for supporting dynamic data redistribution for cross-OSD load balancing. The object storage architecture mirrors the scale-out architecture of cluster computing systems, providing a balanced growth model that adds network bandwidth and processing capability in step with capacity increments to ensure scalability.

A standard for Object Storage Devices is being defined by technical working groups within the Storage Networking Industry Association (SNIA) and the ANSI T10 Technical Committee. The standard includes a command set designed for the iSCSI protocol -- in essence providing object extensions to the traditional SCSI block command set. Together, the object specification and command set define a new wave of intelligent storage devices that can be integrated into massively parallel, high performance, IP-based storage environments. The effort has the participation of many leading storage companies, including EMC, HP, IBM, Intel, Seagate and Veritas.

While the object-based architecture is an essential foundation for enabling the development of massively parallel storage architectures, it does not by itself constitute a storage system. To realize the benefits afforded by objects, a scalable file-metadata management layer must also be developed. This layer manages information such as directory membership and file ownership and permission attributes. It is also responsible for striping "component objects" (portions of files) across OSDs and ensuring data reliability and availability -- for example, by coordinating backup and online redundant encoding, such as RAID levels 1 or 5.
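The redundant encoding mentioned above can be sketched for the RAID 5 case, where a parity unit is the byte-wise XOR of the data units in a stripe, so any single lost unit can be rebuilt from the survivors. This is a minimal sketch under stated assumptions: the function names are hypothetical, units are assumed to be of equal length, and the rotation of parity across devices that RAID 5 also performs is omitted.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_units):
    """Return the data units followed by one parity unit computed over them."""
    return list(data_units) + [xor_blocks(data_units)]


def rebuild(stripe, lost_index):
    """Recover the unit at lost_index from all the other units in the stripe."""
    survivors = [unit for i, unit in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


if __name__ == "__main__":
    units = [b"abcd", b"efgh", b"ijkl"]   # one unit per OSD
    stripe = make_stripe(units)           # a fourth OSD would hold the parity
    print(rebuild(stripe, 1) == b"efgh")  # simulate losing the second OSD
```

RAID 1, by contrast, would simply store a full copy of each component object on a second OSD, trading capacity for simpler recovery.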
This is the layer through which client processes make requests (e.g., to open or close files), are authenticated, and receive authorization credentials and a map of object locations and their host OSDs. The client then uses the map and supplied credentials to directly and securely access the cluster of OSDs, reading and writing file data without additional intervention by the metadata manager. The result is a highly parallel out-of-band data access path that simultaneously supports requests from hundreds of clients, enabling highly scalable applications that employ clustered computing approaches.
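The access path described above can be sketched as follows. The sketch assumes that a credential is an HMAC over the object identifier and access right, computed with a key shared between the metadata manager and the OSDs; that scheme, the key, and every class and function name here are illustrative assumptions rather than the wire format of the standard. The point it shows is that the client contacts the metadata manager once per open, then reads from the OSDs directly.

```python
import hashlib
import hmac

SHARED_KEY = b"demo-key-known-to-manager-and-osds"  # hypothetical shared secret


def sign(object_id, rights, key=SHARED_KEY):
    """Credential for one object and one access right (illustrative scheme)."""
    return hmac.new(key, f"{object_id}:{rights}".encode(), hashlib.sha256).hexdigest()


class OSD:
    """Object storage device that checks credentials itself, without asking the manager."""
    def __init__(self):
        self.objects = {}

    def read(self, object_id, credential):
        if not hmac.compare_digest(credential, sign(object_id, "r")):
            raise PermissionError(f"bad credential for {object_id}")
        return self.objects[object_id]


class MetadataManager:
    """Holds file-to-object maps and permissions; never touches file data."""
    def __init__(self):
        self.layouts = {}   # path -> [(osd_index, object_id), ...]
        self.readers = {}   # path -> set of users allowed to read
        self.requests_served = 0

    def register(self, path, layout, readers):
        self.layouts[path] = list(layout)
        self.readers[path] = set(readers)

    def open(self, path, user):
        self.requests_served += 1
        if user not in self.readers.get(path, ()):
            raise PermissionError(f"{user} may not read {path}")
        return [(osd, oid, sign(oid, "r")) for osd, oid in self.layouts[path]]


def read_file(manager, osds, path, user):
    """Client side: one metadata request, then direct reads from each OSD."""
    object_map = manager.open(path, user)
    return b"".join(osds[osd].read(oid, cred) for osd, oid, cred in object_map)


def demo_cluster():
    """Build a three-OSD cluster holding one striped file readable by 'alice'."""
    osds, manager, path = [OSD() for _ in range(3)], MetadataManager(), "/sim/out.dat"
    layout = []
    for k, chunk in enumerate([b"alpha-", b"beta-", b"gamma-", b"delta"]):
        oid = f"{path}#{k}"
        osds[k % 3].objects[oid] = chunk
        layout.append((k % 3, oid))
    manager.register(path, layout, readers={"alice"})
    return manager, osds, path


if __name__ == "__main__":
    manager, osds, path = demo_cluster()
    print(read_file(manager, osds, path, "alice"), manager.requests_served)
```

In this sketch the four object reads could be issued concurrently to different OSDs, while the metadata manager handles only the single `open` request.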