Volume Replication for High Availability in Kubernetes or OpenShift

24/09/2018
Paul Sobey
demo

While modern hardware is reliable, nodes can and do fail. Whether the cause is a catastrophic hardware failure, an OS crash, or a communication failure between cluster nodes, users with high availability requirements need volume replication. Ondat solves this problem for containerized environments running Kubernetes or OpenShift.

Replication for High Availability

Replication is the process by which one or more volumes can be kept in sync with a single master volume. High availability refers to the ability to switch between the master and replicas at will, so if the master is suddenly unavailable (for whatever reason), a replica can be promoted to master. This is essential for any organization wanting to run stateful applications in containers. Without it, the business risks data loss or downtime.

Feature Spotlight: StorageOS Volume Replication

With replication disabled, a StorageOS volume saves data to a single node in the cluster. If that node fails, access to the volume is suspended for the duration of the failure, causing an outage for the application using the volume.

With replication enabled, when a node fails, StorageOS transparently promotes a replica to master. Mount endpoints migrate to the new master, and applications continue without requiring maintenance or downtime. From the perspective of the application, the only visible effect is a brief pause in I/O while the failover takes place.

This allows applications backed by StorageOS volumes to be turned into HA applications without extra development work or application refactoring.

How Ondat Replication Works

All StorageOS volumes have a master copy to which writes are persisted. When additional replicas are enabled, replica volumes are created on other nodes in the cluster. As applications issue write requests, each write is persisted synchronously to the master and to all replicas, and only then does Ondat (via the operating system) acknowledge the write to the application.
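The synchronous write path described above can be illustrated with a small model. This is a minimal sketch, not Ondat's implementation: the `Node` and `ReplicatedVolume` classes, their methods, and the node names are hypothetical, and the "storage" is just an in-memory dictionary. It shows only the ordering guarantee: the write is acknowledged after every copy has been persisted.

```python
class Node:
    """Hypothetical in-memory stand-in for a storage node."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def persist(self, offset, data):
        # A real node would write to durable storage; here we use a dict.
        self.blocks[offset] = data
        return True


class ReplicatedVolume:
    """Illustrative model: one master plus zero or more replicas."""

    def __init__(self, master, replicas):
        self.master = master
        self.replicas = list(replicas)

    def write(self, offset, data):
        # Persist to the master and to every replica before acknowledging.
        for node in [self.master, *self.replicas]:
            if not node.persist(offset, data):
                return False  # no acknowledgement if any copy failed
        return True  # acknowledged only after all copies are persisted


volume = ReplicatedVolume(Node("node-a"), [Node("node-b"), Node("node-c")])
acked = volume.write(0, b"hello")
```

Because the acknowledgement waits for every copy, any replica is a complete substitute for the master at the moment a write returns, which is what makes promotion safe.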

Volume Replication for HA

Replication traffic travels between nodes using standard TCP/IP and is compressed using the lz4 algorithm.

The number of replicas is tunable per volume, from 0 to 5 depending on customer requirements, by setting a StorageOS label (semantically equivalent to a Kubernetes label). Ondat policies also allow replication defaults to be set for an entire namespace if preferred.
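As a rough illustration of per-volume tuning, the sketch below builds a PersistentVolumeClaim as a plain Python dictionary with a replica-count label and enforces the 0 to 5 range mentioned above. The label key `storageos.com/replicas`, the helper names, and the claim name and size are assumptions made for this example; check the documentation for your version before relying on them.

```python
REPLICAS_LABEL = "storageos.com/replicas"  # assumed label key; verify in your docs
MIN_REPLICAS, MAX_REPLICAS = 0, 5  # range stated in the text


def replica_labels(count):
    """Return a label mapping requesting `count` replicas for a volume."""
    if not isinstance(count, int) or isinstance(count, bool):
        raise TypeError("replica count must be an integer")
    if not MIN_REPLICAS <= count <= MAX_REPLICAS:
        raise ValueError(
            f"replica count must be between {MIN_REPLICAS} and {MAX_REPLICAS}"
        )
    # Label values are strings, as with Kubernetes labels.
    return {REPLICAS_LABEL: str(count)}


def pvc_manifest(name, size, replicas):
    """Build a PersistentVolumeClaim as a plain dict (hypothetical helper)."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "labels": replica_labels(replicas)},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


manifest = pvc_manifest("db-data", "5Gi", replicas=2)
```

The dictionary can be serialized to YAML or JSON and applied with the usual cluster tooling. A storage class is deliberately left out here, since its name depends on the installation.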

The Ondat architecture allows us to move the backing store for a volume ‘under the covers’ while maintaining the same client-facing devices/nodes. In the case of node failure, Ondat promotes a replica to master, and no change is observable from the container other than the short I/O pause during failover.
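The promotion step can be pictured with a toy model. This is a sketch under stated assumptions rather than a description of Ondat internals: the `Volume` class, the `handle_node_failure` method, the device path, and the rule of promoting the first remaining replica are all hypothetical. The point it illustrates is that the client-facing device stays the same while the backing master changes.

```python
class Volume:
    """Illustrative model of a volume whose backing node can change."""

    def __init__(self, device_path, master, replicas):
        self.device_path = device_path  # what the container sees; never changes
        self.master = master
        self.replicas = list(replicas)

    def handle_node_failure(self, failed_node):
        """Promote a replica if the master failed; otherwise drop the failed replica."""
        if failed_node == self.master:
            if not self.replicas:
                raise RuntimeError("no replica available; volume is offline")
            # Promote the first remaining replica. How a real system chooses
            # which replica to promote is not modelled here.
            self.master = self.replicas.pop(0)
        elif failed_node in self.replicas:
            self.replicas.remove(failed_node)


# Hypothetical device path and node names, for illustration only.
vol = Volume("/var/lib/example/volumes/vol-1", "node-a", ["node-b", "node-c"])
path_before = vol.device_path
vol.handle_node_failure("node-a")
```

After the call, `vol.master` is a former replica and `vol.device_path` is unchanged, mirroring the behaviour described above. With zero replicas the model raises an error, which corresponds to the outage described for unreplicated volumes.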