It’s that time of the month again. Ondat Version 2.8, the Elusive Octopus, has arrived! This Elusive Octopus brings you the features to make ETCD in production easier to run, adds the ability to snapshot your data, and gives more insight into how your data is being managed.
Back up your data with snapshots
Gain further confidence in the safety of your data using our new snapshot capability to provide a full snapshot of your data on demand. These snapshots can be used with other services, like Kasten, to create S3 backups of your data outside of your cluster for audit, disaster recovery, or regulatory purposes.
You can easily use these backups to restore your application to a known state, create ad-hoc clones of your application at a chosen point in time, or keep a backup application instance up to date with the newest data so that any failover is as non-disruptive as possible.
Reduce maintenance and external service costs by running ETCD within your cluster
v2.8 brings significant changes that open up the option of running a robust ETCD setup within your production cluster. Running ETCD within your cluster removes the setup, operational overhead, and cost of an external service.
ETCD uptime is crucial, so to minimize any potential disruption we have:
- made changes to the failover detection path that can significantly reduce failover detection times
- included a pod disruption budget to keep the number of instances at or above the level required for a quorum
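To make the quorum point concrete, here is a minimal sketch of the arithmetic behind such a budget. It assumes the standard majority rule for ETCD (more than half of the members must be available); the function names are illustrative and are not part of Ondat.

```python
def quorum_size(members: int) -> int:
    """Smallest majority of an ETCD cluster with `members` voting members."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def max_voluntary_disruptions(members: int) -> int:
    """How many members can be unavailable at once while a majority remains."""
    return members - quorum_size(members)


# Print cluster size, quorum size, and tolerated disruptions for common sizes.
for n in (1, 3, 5):
    print(n, quorum_size(n), max_voluntary_disruptions(n))
```

For a three-member cluster this gives a quorum of two, so a budget that protects quorum allows at most one instance to be evicted at a time.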
Observability is critical when running crucial systems, so ETCD within the cluster will create a ServiceMonitor, which can be used to monitor all ETCD instances. We have created an example Grafana dashboard to help our users get started.
Stability, defragmentation, and compaction improvements are all included to improve quality of life and reduce friction when running ETCD.
Prometheus integration providing logs and metrics for observability
Data safety is the top priority for us and our users, so we’ve been working to improve how this safety can be monitored. v2.8 brings a Prometheus endpoint where Ondat provides information about the disks and filesystems underpinning each volume on every node. The report includes key metrics that can trigger alerts when something demands attention, helping you get ahead of potential problems.
We’re big fans of storage and love getting stuck into data to ensure everything is optimized; the metrics provided will help you analyze your cluster's utilization and performance and drive improvements.
Dynamic systems need a solution to ensure that there are no unilluminated corners where issues can lurk. Our solution means that your Prometheus instance only needs to communicate with a single endpoint, where data for all Ondat volumes is collated and presented.
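As a rough illustration of what a single collated endpoint makes possible, the sketch below parses a sample scrape body in the Prometheus text exposition format and flags volumes above a usage threshold. The metric names, labels, and values are hypothetical placeholders, not Ondat's actual metrics, and the parser is deliberately simplified (it only handles labelled samples and does not handle escaped quotes).

```python
import re

# Sample scrape body in the Prometheus text exposition format.
# Metric names and labels are hypothetical, chosen only for illustration.
SAMPLE = '''\
# HELP example_volume_used_bytes Bytes used on the volume (hypothetical).
# TYPE example_volume_used_bytes gauge
example_volume_used_bytes{volume="pvc-a",node="node-1"} 9.0e9
example_volume_used_bytes{volume="pvc-b",node="node-2"} 2.5e9
example_volume_capacity_bytes{volume="pvc-a",node="node-1"} 1.0e10
example_volume_capacity_bytes{volume="pvc-b",node="node-2"} 1.0e10
'''

# Matches `name{labels} value`; samples without labels are skipped.
LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse(text):
    """Return {(metric_name, volume): value} for every labelled sample."""
    samples = {}
    for line in text.splitlines():
        if not line or line.startswith('#'):
            continue  # skip blank lines and HELP/TYPE comments
        m = LINE.match(line)
        if m is None:
            continue
        labels = dict(LABEL.findall(m['labels']))
        samples[(m['name'], labels.get('volume'))] = float(m['value'])
    return samples


def volumes_over(samples, threshold):
    """List volumes whose used/capacity ratio exceeds `threshold`."""
    over = []
    for (name, volume), used in samples.items():
        if name != 'example_volume_used_bytes':
            continue
        capacity = samples.get(('example_volume_capacity_bytes', volume))
        if capacity and used / capacity > threshold:
            over.append(volume)
    return sorted(over)


print(volumes_over(parse(SAMPLE), 0.8))
```

In practice this kind of check would normally live in a Prometheus alerting rule rather than a script; the sketch only shows that one scrape carries enough information to reason about every volume.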
We have created a Grafana dashboard summarizing the key metrics to provide our users with a quick reassurance that volumes are behaving as expected.
With its 2.8 release, Ondat is further enhanced and matured to support stateful applications. But the journey doesn’t stop here and there is more to come!