DSstar: Providing News & Information For Data Intensive
Storage Solutions For The Enterprise

Features - Enterprise Data Insights:

NEXT-GENERATION CLUSTERING COMES TO WINDOWS DATA CENTER
By Steve Norall, PolyServe Inc

A typical Windows data center is beset by unmanageable numbers of standalone servers, active-passive server pairs and storage points, little or no high-availability (HA) protection for critical applications, and an inability to share data easily. Not surprisingly, many Windows enterprises are waking up to the cost and complexity produced by the sprawl of servers and their associated storage and software, and are looking to regain control of their data centers.

An emerging option for these enterprises is shared data clustering, a technology new to the Windows data center. It provides total cost of ownership (TCO), manageability and fault-tolerance benefits previously available only to Linux and UNIX enterprises.

Shared data clustering is distinctly different from the high-availability clustering software common among Windows data centers. The latter is limited to providing failover protection, usually in clusters of two to four servers in active-passive configurations. While HA protection is valuable, it can exacerbate the costly server buildup plaguing many data centers by adding numerous, under-utilized backup servers.

Shared data clustering, for the first time in the Windows data center, provides both high availability and the ability for clustered servers to read and write to the same shared storage and scale out horizontally as demand dictates. It's now possible to build flexible, scalable on-demand clusters of up to 16 Windows-based servers that have direct, simultaneous access to file systems in a storage area network (SAN).

This new generation of Windows clustering has been achieved through several distributed computing breakthroughs. Chief among them is a fully distributed, fully journaled cluster file system (CFS) that supports online additions and deletions of nodes and concurrent multi-node access to shared data. CFS solutions were previously available only for Linux and UNIX. Another key technology development is a completely symmetric, distributed lock manager that eliminates both single points of failure clusterwide and the single-server bottleneck on file operations.
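To make the idea of a symmetric lock manager concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a description of any vendor's implementation: it assumes that responsibility for arbitrating each file's lock is spread across nodes by hashing the file path, and that surviving nodes simply recompute the mapping when a node leaves. The function names, node names and paths are hypothetical.

```python
import zlib


def lock_master(resource, nodes):
    """Return the node that arbitrates locks for `resource`.

    Hypothetical scheme: hash the resource name over the live nodes,
    so no single server handles every lock request.
    """
    if not nodes:
        raise ValueError("cluster has no live nodes")
    # crc32 is stable across processes, unlike Python's built-in hash().
    index = zlib.crc32(resource.encode("utf-8")) % len(nodes)
    return sorted(nodes)[index]


def mastership_counts(resources, nodes):
    """Count how many resources each live node arbitrates."""
    counts = {node: 0 for node in nodes}
    for resource in resources:
        counts[lock_master(resource, nodes)] += 1
    return counts


nodes = ["node01", "node02", "node03", "node04"]
files = ["/shared/data/file%d.dat" % i for i in range(1000)]

before = mastership_counts(files, nodes)

# Simulate the loss of one node: the survivors take over its locks.
survivors = [n for n in nodes if n != "node03"]
after = mastership_counts(files, survivors)

print("before failure:", before)
print("after failure: ", after)
```

Every file still has a lock arbiter after the simulated failure, which is the property the symmetric design is meant to provide. A plain modulo mapping reshuffles many locks when membership changes, so a production design would likely use something more stable; that choice is outside what this sketch tries to show.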

Shared Data Clustering For Consolidation

The shared data clustering approach has major implications for data center consolidation projects aimed at reducing cost and complexity for widely deployed applications such as SQL Server, file serving and Web content serving. Shared data clustering reduces costs across all four of the main TCO components: operating/staffing, capital, maintenance support and downtime.

Shared data clustering allows administrators to consolidate under-utilized servers and applications onto a single fault-tolerant, modular, scalable cluster. This reduces operational costs by cutting the number of managed servers nearly in half, and simplifies server and application management by allowing IT to manage many nodes as one from a single, integrated control console.

Maximizing the use of existing servers reduces capital costs, because less new hardware has to be purchased. It's also unnecessary to purchase new, larger servers to achieve consolidation. Shared data clustering enables horizontal scaling -- or "scale out" -- using smaller low-cost Windows servers versus a "scale up" approach using large, expensive multi-processor boxes. Only with a cluster file system is it possible to manage the servers and their data as a single entity and, with file-based applications, achieve performance that equals or exceeds that of larger servers -- at a much lower cost.

Shared data clustering also decreases maintenance costs for servers and storage. Smaller servers are less costly to support than larger ones, and because servers that share data require less storage, fewer storage entities are needed. The result is lower maintenance spending on storage.

The fourth TCO component -- downtime -- can be reduced beyond the capability of traditional HA clusters. A shared data cluster far exceeds the fault tolerance of an HA infrastructure dependent upon small clusters of active-passive pairs. In a shared data cluster all servers provide backup for each other. A 16-node cluster configuration represents a 15x improvement in tolerance to server failures versus a single active-passive pair. Even when multiple nodes fail in a cluster, continuous availability is maintained.
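The 15x figure can be reproduced with simple arithmetic. The Python sketch below assumes one reading of that claim: an active-passive pair survives the loss of one server, while a cluster in which every node can back up every other keeps running until its last node fails. It ignores whether the surviving nodes have enough capacity to carry the full workload, and the function name is hypothetical.

```python
def tolerated_failures(node_count):
    """Server failures a cluster can absorb while one node stays up.

    Assumes any surviving node can take over for any failed node.
    """
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    return node_count - 1


pair_tolerance = tolerated_failures(2)      # active-passive pair
cluster_tolerance = tolerated_failures(16)  # 16-node shared data cluster

print("pair tolerates", pair_tolerance, "failure")
print("cluster tolerates", cluster_tolerance, "failures")
print("improvement: %dx" % (cluster_tolerance // pair_tolerance))  # prints "improvement: 15x"
```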

Application Candidates For Shared Data Access

With all clustered Windows servers given direct concurrent access to shared data, administrators can scale workloads across multiple servers. Using a high-speed symmetric Distributed Lock Manager (DLM), shared data clustering also provides complete cache coherence across the cluster. Without direct concurrent access to shared data, it's not practical for multiple servers to work on the same application and data at the same time.
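One common way a lock manager supports cache coherence is through lock modes: many nodes may hold a shared (read) lock on a file at once, but an exclusive (write) lock is granted only when no other node holds any lock on it. The Python sketch below illustrates that compatibility rule only. It assumes that a node caches file data solely while it holds a lock, and it models the whole cluster in one process. The class and method names are hypothetical and do not describe a specific product.

```python
class ToyLockTable:
    """Single-process model of shared/exclusive lock compatibility."""

    def __init__(self):
        self.readers = {}  # resource -> set of nodes holding a shared lock
        self.writer = {}   # resource -> node holding the exclusive lock

    def acquire_shared(self, node, resource):
        """Grant a read lock unless another node is writing."""
        holder = self.writer.get(resource)
        if holder is not None and holder != node:
            return False
        self.readers.setdefault(resource, set()).add(node)
        return True

    def acquire_exclusive(self, node, resource):
        """Grant a write lock only if no other node holds any lock."""
        holder = self.writer.get(resource)
        if holder is not None and holder != node:
            return False
        other_readers = self.readers.get(resource, set()) - {node}
        if other_readers:
            return False
        self.writer[resource] = node
        return True

    def release(self, node, resource):
        """Drop every lock `node` holds on `resource`."""
        self.readers.get(resource, set()).discard(node)
        if self.writer.get(resource) == node:
            del self.writer[resource]


table = ToyLockTable()
path = "/shared/reports/q1.dat"

print(table.acquire_shared("node01", path))     # True: first reader
print(table.acquire_shared("node02", path))     # True: readers coexist
print(table.acquire_exclusive("node03", path))  # False: readers still active

table.release("node01", path)
table.release("node02", path)

print(table.acquire_exclusive("node03", path))  # True: no other holders
print(table.acquire_shared("node01", path))     # False: writer is active
```

Because a writer is admitted only after every other node has released its lock, and under the stated assumption dropped its cached copy, no node can keep reading stale data while the file is being changed.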

Not all Windows-based applications are structured to take equal advantage of shared data access. Three key applications that derive significant benefits from a shared data architecture are file serving, SQL Server and Web serving.

File serving is the most common application of Windows servers. Most corporations have hundreds, if not thousands, of file servers. Data centers struggle to manage large quantities of both Windows NT 4.0 and Windows 2000 file servers in a coherent, simple way. They also have many distributed file servers (or active-passive pairs), each with its own storage, resulting in isolated data silos that need to be backed up and managed. Many administrators are now demanding better utilization of servers and question the drain on the bottom line of "passive" or "backup" servers.

Besides solving each of these file-serving concerns, shared data clustering represents a strong alternative to network-attached storage (NAS) appliances or filers. A small group of file servers harnessed by a cluster file system becomes a powerful NAS cluster whose performance can easily exceed a high-end NAS appliance at a fraction of the cost.

A consolidated, shared data infrastructure for SQL Server offers substantial cost savings and management economies over existing deployment techniques. By consolidating multiple SQL Server instances onto a shared data cluster, management effort is dramatically reduced, while all SQL Server instances benefit from high-availability protection. Storage management is also greatly simplified: storage can be provisioned centrally, and a single node in the cluster can back up all data clusterwide.

Web serving with Microsoft Internet Information Server (IIS) is a natural fit for scale-out architectures. A shared data clustering implementation offers a simplified way to manage Web content through a shared file system and data store. Web administrators can centralize all Web content into a single, easy-to-manage repository where one change to data or content is immediately available and current across all Web servers. This eliminates the error-prone steps of copying or replicating data from node to node.

Summing Up

Shared data clustering with high availability can be the foundation for building or improving scale-out deployments. By providing direct concurrent access to shared data from multiple Windows servers, this new generation of clustering provides the basis to scale workloads across multiple servers. Any Windows IT organization or department leader planning server, storage, software or staffing consolidation should consider its potential for delivering a more manageable, available and affordable Windows computing infrastructure.

