5 Jan 2018

There is no shortage of solutions available to ensure minimal disruption if a server fails or you have to recover from a cyber attack.

Servers do not prevent data loss

A standard x86-based server typically stores data on RAID (Redundant Array of Independent Disks) storage devices. The capabilities of x86 servers vary from vendor to vendor, and they support a variety of operating systems and processors. However, a standard x86 server may have only basic backup, data-replication, and failover procedures in place, which leaves it susceptible to catastrophic server failures.


A standard server is not designed to prevent downtime or data loss. In the event of a crash, the server stops all processing and users lose access to their applications and information, so data loss is likely. Standard servers do not protect data in transit, so if the server goes down, this data is lost as well. Although a standard x86 server is not highly available as shipped by its vendor, availability software can always be added after initial deployment and installation.

Automatic take-overs with server clusters

Traditional high-availability solutions, which can bring a system back up quickly, are typically based on server clustering: two or more servers that run with the same configuration and are connected by cluster software that keeps the application data updated on all servers.

Servers (nodes) in a high-availability cluster communicate with each other by continually checking for a heartbeat which confirms other servers in the cluster are up and running. If a server fails, another server in the cluster, designated as the failover server, will automatically take over, ideally with minimal disruption to users.
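The heartbeat check described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular cluster product works: the `Node` class, the `pick_active` function, and the three-second `timeout` are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    last_heartbeat: float  # time (in seconds) the last heartbeat was received


def pick_active(nodes, active, now, timeout=3.0):
    """Return the name of the node that should be serving users.

    A node counts as up if its last heartbeat arrived within `timeout`
    seconds of `now`. The timeout value is an assumption for illustration.
    """
    alive = [n.name for n in nodes if now - n.last_heartbeat <= timeout]
    if active in alive:
        return active  # the current server is still healthy, nothing to do
    if not alive:
        raise RuntimeError("no healthy node available for failover")
    return alive[0]  # fail over to the first healthy node in the list


nodes = [Node("primary", last_heartbeat=10.0), Node("standby", last_heartbeat=14.0)]
print(pick_active(nodes, active="primary", now=12.0))  # primary still responds
print(pick_active(nodes, active="primary", now=15.0))  # primary silent too long
```

In this sketch the order of the `nodes` list stands in for the designation of a failover server; a real cluster would take that from its configuration.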

Computers in a cluster are connected by a local area network (LAN) or a wide area network (WAN) and are managed by cluster software. Failover clusters require a storage area network (SAN) to provide the shared access to data that failover depends on. This means that dedicated shared storage, or redundant connections to the corporate SAN, are also necessary.

Continuous administrative oversight

While high-availability clusters improve availability, their effectiveness is highly dependent on the skills of specialised IT personnel. Clusters can be complex and time-consuming to deploy and they require programming, testing, and continuous administrative oversight. As a result, the total cost of ownership is often high.

It is also important to note that high-availability clusters do not eliminate downtime. In the event of a server failure, all users who are currently connected to that server lose their connections, and any data not yet written to the database is lost.
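To make the last point concrete, here is a small Python sketch of why unwritten data does not survive a cluster failover. It assumes a simplified model in which the failover server can see only what has reached shared storage; the class and function names (`SharedStorage`, `ClusterNode`, `fail_over`) are hypothetical and chosen for illustration.

```python
class SharedStorage:
    """Stands in for the shared storage that every cluster node can reach."""

    def __init__(self):
        self.committed = []


class ClusterNode:
    def __init__(self, storage):
        self.storage = storage
        self.pending = []  # held in this server's memory only

    def write(self, record):
        self.pending.append(record)  # accepted, but not yet on shared storage

    def flush(self):
        self.storage.committed.extend(self.pending)
        self.pending.clear()


def fail_over(storage):
    """The failover server starts with the shared storage and nothing else."""
    return ClusterNode(storage)


storage = SharedStorage()
primary = ClusterNode(storage)
primary.write("order-1")
primary.flush()
primary.write("order-2")  # primary crashes before this record is flushed

standby = fail_over(storage)
print(standby.storage.committed)  # only the flushed record is visible
```

After the failover, the standby sees `order-1` but has no knowledge of `order-2`, which existed only in the failed server's memory.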

Fault-tolerant solutions are also referred to as continuous availability solutions. A fault-tolerant server provides the highest availability because it has system component redundancy with no single point of failure. This means that end users never experience an interruption in server availability because downtime is pre-empted.

67% of best-in-class organisations use fault-tolerant servers to provide high availability to at least some of their most critical applications. Fault tolerance is achieved in a server by having a second set of completely redundant hardware components in the system architecture. The server’s software automatically synchronises the replicated components, executing all processing in lockstep so that “in flight” data is always protected.

The two sets of CPUs, RAM, motherboards, and power supplies are all processing the same information at the same time. Therefore if one component fails, its companion component is already there and running, and the system keeps functioning.
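The idea of two component sets doing the same work at the same time can be illustrated with a software analogy in Python. Real fault-tolerant servers do this in hardware, so this is only a sketch of the principle: the `Replica` and `LockstepPair` classes are hypothetical, and a running total stands in for in-memory application state.

```python
class Replica:
    """One of two identical component sets, each holding the same state."""

    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.total = 0  # in-memory state kept by this replica

    def apply(self, amount):
        if not self.healthy:
            raise RuntimeError(f"{self.name} has failed")
        self.total += amount
        return self.total


class LockstepPair:
    """Sends every operation to both replicas, so each is always current."""

    def __init__(self):
        self.replicas = [Replica("A"), Replica("B")]

    def apply(self, amount):
        results = []
        for replica in self.replicas:
            try:
                results.append(replica.apply(amount))
            except RuntimeError:
                continue  # a failed replica is skipped; its companion carries on
        if not results:
            raise RuntimeError("both replicas have failed")
        return results[0]


pair = LockstepPair()
pair.apply(5)                     # both replicas now hold a total of 5
pair.replicas[0].healthy = False  # replica A fails
print(pair.apply(3))              # replica B answers with the full total
```

Because replica B processed every earlier operation alongside replica A, it already holds the complete state when A fails, and no recovery or restart step is needed.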

Built-in software technology

Fault-tolerant servers also have built-in, fail-safe software technology that detects, isolates, and corrects system problems before they cause downtime. This means that the operating system, middleware, and application software are protected from errors. In-memory data is also constantly protected and maintained.

A fault-tolerant server is managed exactly like a standard server, making the system easy to install, use, and maintain. No software modifications or special configurations are necessary and the sophisticated back-end technology runs in the background, invisible to anyone administering the system.

In business environments where downtime must be kept to an absolute minimum, fault-tolerant systems provide peace of mind that crucial data will not be lost.

This information was provided by Duncan Cooke.