A high availability solution consists of several different components:
When designing a high availability solution, keep in mind that installing all key servers at a single location creates a potential single point of failure (SPOF) if that location is hit by a disaster or a power failure. The environmental conditions of the servers should also be taken into account: air conditioning systems, ideally redundant ones, are essential.
Even the most sophisticated software cannot produce a high availability system unless the hardware underneath it is as fail-safe as possible. The key hardware components that should be considered and laid out with the greatest possible redundancy are:
If possible, protect your servers with a UPS (uninterruptible power supply) so that a brief power failure can be bridged and the systems can be shut down cleanly if the outage lasts longer. The power supply itself should also be configured for redundancy.
Make sure each of your systems has several network interfaces. If one interface fails, another must automatically take over the address and task of the failed component. Redundancy applies to both interface directions: it is worth planning an active and a backup interface for the internal side as well as for the external side.
Assign several hard disks to your system and set up data mirroring (for example, using RAID or DRBD) in such a way that if one of these disks is lost, the others still contain an intact copy of the data. It must be possible to replace a faulty disk with a new one without stopping the system.
All important data and applications that your systems present to the outside must be set up in such a way that they never prevent a restart. For example, if an application does not release its lock files after a crash, the relevant process cannot be restarted, which means that the application is not suitable for a high availability environment. Ideally, the “health” of key applications, operating system processes, and network connections should be monitored with a suitable monitoring tool.
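The lock file problem described above can be illustrated with a minimal Python sketch. It assumes a POSIX system and a hypothetical application that stores its process ID in a lock file; the function names `pid_is_running` and `acquire_lock` are invented for this illustration and do not belong to any particular product. The idea is that a lock whose owning process no longer exists is treated as stale and removed, so a crash does not block the restart.

```python
import os


def pid_is_running(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def acquire_lock(path: str) -> bool:
    """Take a PID lock file; clear it first if its owner has died.

    Returns True if this process now holds the lock, False if a live
    process still owns it.
    """
    for _ in range(2):  # second pass runs only after a stale lock was removed
        try:
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        except FileExistsError:
            try:
                with open(path) as f:
                    owner = int(f.read().strip() or "0")
            except (FileNotFoundError, ValueError):
                owner = 0  # unreadable or vanished lock counts as stale
            if pid_is_running(owner):
                return False
            try:
                os.unlink(path)  # owner is gone: the lock is stale
            except FileNotFoundError:
                pass
            continue
        with os.fdopen(fd, "w") as f:
            f.write(str(os.getpid()))
        return True
    return False
```

This sketch is not race-free: between creating the file and writing the PID, a second process could read an empty file and consider it stale. Kernel-level locks that are released automatically when a process dies avoid that window and are usually the safer choice.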
After a system fails, all key data must be available to the failover system, complete and intact. This type of high availability is achieved by distributing stored data over several systems or hard disks. To do this, the contents of a disk are regularly mirrored to another disk (or several disks), which can take over with an intact copy of the data if a failure occurs. Use a journaling file system to ensure that a file system comes back up in a consistent state after a system crash.
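The mirroring idea can be sketched at application level in Python. This is only an illustration of the principle: in practice, mirroring is normally handled below the application by RAID or DRBD. The function names and the choice of SHA-256 checksums are assumptions made for this example. Each replica is written to a temporary file, flushed to disk, and then renamed into place, and a checksum identifies which copies are still intact after a failure.

```python
import hashlib
import os


def sha256_of(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def mirrored_write(data: bytes, replicas: list[str]) -> str:
    """Write the same data to every replica path and return its digest."""
    for path in replicas:
        tmp = path + ".tmp"
        with open(tmp, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # force the data out of the OS cache
        os.replace(tmp, path)  # atomic rename: readers see old or new, not half
    return hashlib.sha256(data).hexdigest()


def intact_replicas(replicas: list[str], expected_digest: str) -> list[str]:
    """Return the replica paths whose content still matches the digest."""
    good = []
    for path in replicas:
        try:
            if sha256_of(path) == expected_digest:
                good.append(path)
        except OSError:
            pass  # a missing or unreadable replica counts as failed
    return good
```

A failover system would read from any path returned by `intact_replicas`. In a real deployment the replica paths would sit on different disks or hosts, since copies on the same disk do not protect against a disk failure.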
All network infrastructure should be configured for redundancy, from the router and switch infrastructure down to the simple network cable.