
Features - Enterprise Data Insights:
NEXT-GENERATION CLUSTERING COMES TO WINDOWS DATA CENTER
By Steve Norall, PolyServe Inc.
A typical Windows data center is beset by unmanageable numbers of standalone
servers, active-passive server pairs and storage points; limited or no
high-availability (HA) protection for critical applications; and an inability
to easily share data. Not surprisingly, many Windows enterprises are awakening
to the underlying cost and complexity produced by the sprawl of servers and
their associated storage and software, and are looking to regain control of
their data centers.
An emerging option for these enterprises is a technology solution new to the
Windows data center known as shared data clustering. This solution provides
total cost of ownership (TCO), manageability and fault-tolerance benefits
previously available only to Linux and UNIX enterprises.
Shared data clustering is distinctly different from the high-availability
clustering software common among Windows data centers. The latter is limited
to providing failover protection, usually in clusters of two to four servers
in active-passive configurations. While HA protection is valuable, it can
exacerbate the costly server buildup plaguing many data centers by adding
numerous, under-utilized backup servers.
Shared data clustering, for the first time in the Windows data center,
provides both high availability and the ability for clustered servers to read
and write to the same shared storage and scale out horizontally as demand
dictates. It's now possible to build flexible, scalable on-demand clusters of
up to 16 Windows-based servers that have direct, simultaneous access to file
systems in a storage area network (SAN).
This new generation of Windows clustering has been achieved through several
distributed computing breakthroughs. Chief among them is a fully distributed,
fully journaled cluster file system (CFS) that supports online additions and
deletions of nodes and concurrent multi-node access to shared data. CFS
solutions have previously been available only for Linux and UNIX. Another key
technology development is a completely symmetric, distributed lock manager
that eliminates any single point of failure clusterwide and the single-server
bottleneck on file operations.
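To make the lock manager's role concrete, here is a toy, single-process sketch
in Python. It is not PolyServe's implementation: the class name ToyLockManager,
the shared/exclusive lock modes and the hash-based choice of a coordinating
node are all assumptions made for illustration. It shows two ideas only:
responsibility for each lock is spread across every node rather than pinned to
one server, and many readers can hold a lock together while a writer must hold
it alone.

```python
import hashlib


class ToyLockManager:
    """In-process model of a symmetric lock manager (illustrative only)."""

    def __init__(self, nodes):
        self.nodes = sorted(nodes)
        # resource name -> (mode, set of nodes currently holding the lock)
        self.holders = {}

    def master_for(self, resource):
        # Assumption: the coordinating node for a resource is picked by
        # hashing its name, so no single server coordinates every lock.
        digest = hashlib.sha256(resource.encode()).digest()
        return self.nodes[int.from_bytes(digest[:4], "big") % len(self.nodes)]

    def acquire(self, node, resource, mode):
        """Request a 'shared' (read) or 'exclusive' (write) lock."""
        if mode not in ("shared", "exclusive"):
            raise ValueError("mode must be 'shared' or 'exclusive'")
        current = self.holders.get(resource)
        if current is None:
            self.holders[resource] = (mode, {node})
            return True
        held_mode, owners = current
        if mode == "shared" and held_mode == "shared":
            owners.add(node)  # readers may share the lock
            return True
        return False  # any combination involving a writer conflicts

    def release(self, node, resource):
        current = self.holders.get(resource)
        if current is None:
            return
        _, owners = current
        owners.discard(node)
        if not owners:
            del self.holders[resource]


locks = ToyLockManager(["node1", "node2", "node3"])
read1 = locks.acquire("node1", "/san/fs1/report.doc", "shared")
read2 = locks.acquire("node2", "/san/fs1/report.doc", "shared")
write_blocked = locks.acquire("node3", "/san/fs1/report.doc", "exclusive")
locks.release("node1", "/san/fs1/report.doc")
locks.release("node2", "/san/fs1/report.doc")
write_granted = locks.acquire("node3", "/san/fs1/report.doc", "exclusive")
```

In this sketch two nodes read the same file at once, but a third node's write
request is refused until both readers release the lock. A real distributed
lock manager would also have to handle messaging between nodes and recovery
when a lock holder fails, which this model leaves out.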
Shared Data Clustering For Consolidation
The shared data clustering approach has major implications for data center
consolidation projects aimed at reducing cost and complexity for widely
deployed applications such as SQL Server, file serving and Web content
serving. Shared data clustering reduces costs across all four of the main TCO
components: operating/staffing, capital, maintenance support and downtime.
Shared data clustering allows administrators to consolidate under-utilized
servers and applications onto a single fault-tolerant, modular, scalable
cluster. This reduces operational costs by cutting the number of managed
servers nearly in half, and simplifies server and application management by
allowing IT to manage many nodes as one from a single, integrated control
console.
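A rough back-of-the-envelope calculation shows how a reduction of nearly half
can arise. It assumes, purely for illustration, that the servers being
consolidated are active-passive pairs and that the shared data cluster keeps
one spare node; the function names and the workload count are hypothetical.

```python
def servers_for_pairs(workloads: int) -> int:
    """Each workload gets its own active server plus a passive standby."""
    return 2 * workloads


def servers_for_shared_cluster(workloads: int, spares: int = 1) -> int:
    """One active server per workload, plus spares shared by the whole cluster."""
    return workloads + spares


workloads = 8  # hypothetical number of applications being consolidated
before = servers_for_pairs(workloads)          # 16
after = servers_for_shared_cluster(workloads)  # 9
reduction = 1 - after / before
print(f"{before} servers -> {after} servers ({reduction:.0%} fewer)")
```

The saving comes from replacing a dedicated standby per application with a
small pool of spare capacity that every application can fall back on.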
Maximizing the use of existing servers reduces capital costs because there is
less need to buy more hardware. It's also unnecessary to purchase new, larger
servers to achieve consolidation. Shared data clustering enables horizontal
scaling -- or "scale out" -- using smaller low-cost Windows servers versus a
"scale up" approach using large, expensive multi-processor boxes. Only with a
cluster file system is it possible to manage the servers and their data as a
single entity and, with file-based applications, achieve performance that
equals or exceeds that of larger servers -- at a much lower cost.
Shared data clustering also decreases maintenance costs for servers and
storage. Smaller servers are less costly to support than larger servers, and
because servers that share data require less storage, fewer storage units are
needed. The result is lower maintenance fees for storage.
The fourth TCO component -- downtime -- can be reduced beyond the capability
of traditional HA clusters. A shared data cluster offers far more protection
than an HA infrastructure built from small clusters of active-passive pairs.
In a shared data cluster all servers provide backup for each other. A 16-node
cluster can in principle lose 15 servers and keep running, where an
active-passive pair can lose only one: a 15x improvement in tolerance to
server failures. Even when multiple nodes fail in a cluster, continuous
availability is maintained.
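The 15x figure can be checked with a few lines of Python. This is a simplified
model, not a sizing tool: it assumes that a cluster stays available as long as
one node survives and that the survivors have enough capacity to absorb the
failed nodes' work. The function name is hypothetical.

```python
def tolerable_failures(nodes: int) -> int:
    """Servers that can fail while at least one survivor keeps serving."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return nodes - 1


pair = tolerable_failures(2)      # active-passive pair: 1
cluster = tolerable_failures(16)  # 16-node shared data cluster: 15
ratio = cluster // pair           # 15
print(f"pair: {pair}, 16-node cluster: {cluster}, improvement: {ratio}x")
```

In practice the number of failures a cluster can absorb also depends on how
heavily each node is loaded, so the count above is an upper bound.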
Application Candidates For Shared Data Access
With all clustered Windows servers given direct concurrent access to shared
data, administrators can scale workloads across multiple servers. Using a
high-speed symmetric Distributed Lock Manager (DLM), shared data clustering
also provides complete cache coherence across the cluster. Without direct
concurrent access to shared data, it's not practical for multiple servers to
work on the same application and data at the same time.
Not all Windows-based applications are equally well structured to take
advantage of shared data access. Three key applications that derive
significant benefits from a shared data architecture are file serving, SQL
Server and Web serving.
File serving is the most common application of Windows servers. Most
corporations have hundreds, if not thousands, of file servers. Data centers
struggle to manage large quantities of both Windows NT 4.0 and Windows 2000
file servers in a coherent, simple way. They also have many distributed file
servers (or active-passive pairs), each with its own storage, resulting in
isolated data silos that need to be backed up and managed. Many administrators
are now demanding better utilization of servers and questioning the drain that
"passive" or "backup" servers place on the bottom line.
Besides solving each of these file-serving concerns, shared data clustering
represents a strong alternative to network-attached storage (NAS) appliances
or filers. A small group of file servers harnessed by a cluster file system
becomes a powerful NAS cluster whose performance can easily exceed that of a
high-end NAS appliance at a fraction of the cost.
A consolidated, shared data infrastructure for SQL Server offers substantial
cost savings and management economies over existing deployment techniques.
Consolidating multiple SQL Server instances onto a shared data cluster
dramatically reduces management effort, while all SQL Server instances
benefit from high-availability protection. Also, storage management
is greatly simplified. Storage can be provisioned centrally and a single node
in the cluster can back up all data clusterwide.
Web serving with Microsoft Internet Information Services (IIS) is a natural
fit for scale-out architectures. A shared data clustering implementation offers a
simplified way to manage Web content through a shared file system and data
store. Web administrators can centralize all Web content into a single,
easy-to-manage repository where one change to data or content is immediately
available and current across all Web servers. This eliminates the error-prone
steps of copying or replicating data from node to node.
Summing Up
Shared data clustering with high availability can be the foundation for
implementing or improving scale-out deployments. By providing direct
concurrent access to shared data from multiple Windows servers, this new
generation of clustering provides the basis to scale workloads across multiple
servers. Any Windows IT organization or department leader considering server,
storage, software or staffing consolidation should weigh its potential for
delivering a more manageable, available and affordable Windows computing
infrastructure.