Removing the Storage Bottleneck in Cloud-Native Computing


There is a longstanding trend in the IT industry: compute iterates incredibly quickly, networking a little more slowly, while generational advances in storage have tended to be exceptionally slow. This is because data and applications have gravity, and everything depends on the storage layer being consistent, robust, and reliable, so there is inherent inertia against changing storage components. In a highly innovative arena like Cloud Native development, this can turn storage into a severe bottleneck to both progress and performance, not least for modern DevOps teams looking to run stateful applications in Kubernetes.

Even beyond the storage industry’s natural tendency towards conservatism and slow, steady progress, storage presents tough challenges for Cloud Native development. In the Cloud Native space, you want everything to be declarative, you want everything to be portable, and you want the orchestrator to be able to move things around and reshuffle your workloads. The minute you introduce storage into the equation, things aren’t portable. You break the model: applications and containers become locked into specific nodes. Without a Cloud-Native data solution that makes the data portable, the storage becomes a significant inhibitor.
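To make the portability problem concrete, here is a minimal sketch of a toy placement check. It is not how Kubernetes schedules workloads; the `Workload` class, the `data_node` field, and the node names are all hypothetical, chosen only to illustrate how node-local data shrinks the set of places an orchestrator can move a workload to.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Workload:
    name: str
    # Node that holds the workload's data when storage is node-local.
    # None means the data can be reached from any node (hypothetical model).
    data_node: Optional[str] = None


def schedulable_nodes(workload: Workload, healthy_nodes: List[str]) -> List[str]:
    """Return the nodes a toy orchestrator could place this workload on."""
    if workload.data_node is None:
        # No data tied to a node: any healthy node will do.
        return list(healthy_nodes)
    # Data is tied to one node: only that node qualifies, if it is healthy.
    return [n for n in healthy_nodes if n == workload.data_node]


# Assume node-a has failed, leaving two healthy nodes.
healthy = ["node-b", "node-c"]

stateless = Workload("web")
pinned_db = Workload("db", data_node="node-a")
portable_db = Workload("db")  # same app, but its data can follow it

print(schedulable_nodes(stateless, healthy))
print(schedulable_nodes(pinned_db, healthy))
print(schedulable_nodes(portable_db, healthy))
```

In this toy model the stateless workload and the database with portable data can land on either healthy node, while the database pinned to the failed node has nowhere to go.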

I first encountered these problems in a previous life, developing business application infrastructure in the financial sector. In the early 2010s, although some config management and other tooling was available, it was a long way off the ideals of composable, declarative environments found in today’s cloud native infrastructure. As a result, servers or nodes were often irreplaceable and hard to build: “pets” in today’s “pets vs cattle” analogy. This made it incredibly difficult to deal with everything around compliance, governance, audit, data retention, and disaster recovery.

We wanted to build automation that let developers define the infrastructure needed for their applications. This would enable developer self-service, giving them ease of use and a quicker time to market. More importantly, we needed re-creatable infrastructure environments.

The idea was to say to developers: you define what the application needs, and the automation goes and makes that happen. Then in a disaster scenario, if something breaks, we can quickly recreate the infrastructure environment somewhere else, based on the initial definitions. This will sound instantly familiar to anyone using modern infrastructure-as-code and DevOps techniques alongside Kubernetes or other cloud-native environments. But at the time, this was a huge challenge. And storage made this challenge all the more complicated.
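The "define it, and the automation makes it happen" idea can be sketched as a comparison between a desired definition and whatever currently exists. This is only an illustration under assumed names: the `plan` function, the resource names, and the spec fields are hypothetical and do not describe the tooling we actually built.

```python
from typing import Dict, List, Tuple

# A spec maps a resource name to its requested settings (hypothetical shape).
Spec = Dict[str, dict]


def plan(desired: Spec, actual: Spec) -> List[Tuple[str, str]]:
    """Work out which actions would bring `actual` in line with `desired`."""
    actions = []
    for name, settings in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != settings:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


# What the developers declared for their application.
desired = {
    "app-server": {"cpus": 4, "memory_gb": 16},
    "database": {"cpus": 8, "memory_gb": 64, "volume_gb": 500},
}

# Disaster scenario: the replacement site starts empty, so the plan is
# simply to create everything named in the original definition.
print(plan(desired, actual={}))
```

Because the definition is the source of truth, the same comparison works whether the environment is being built for the first time or rebuilt elsewhere. The sketch only covers the infrastructure, though: recreating the `database` entry does not bring its data back, which is where storage complicates the picture.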

Storage was made up of big black boxes, islands of storage locked into specific data centers. When you wanted to deploy in data center A, you found you only had capacity in data center B. Storage systems were non-standard and didn’t have APIs for automation. In some ways, you were thankful for the storage industry’s slow pace of change because each new generation of storage array came with a completely new framework for control and integration.

After spending a number of years trying to retrofit cloud native functionality into non-cloud-native technology, we eventually concluded that we needed to build the solution ourselves. And so the seed of Ondat was sown.

We set about creating an agile, API-driven software data mesh that could plug into these conservative storage environments. What has changed over the years is that containers came along first, making applications more portable and easier to orchestrate. Then came orchestrators, and Kubernetes became the default. So the system we were aspiring to build in these financial services environments was becoming a more general-purpose reality across the industry and the ecosystem.

The CNCF has played an enormous role in this more recent change, and in my next blog, I will take a look at some of the activities of the CNCF Storage TAG, along with some key developments and resources we created to evaluate and tackle the challenges of Cloud Native storage. 

Alex Chircop, CEO, Ondat

written by:
Alex Chircop
Alex is a founder and CTO of Ondat (formerly StorageOS), building software-defined solutions for cloud-native environments. Alex is also a co-chair of the CNCF Storage TAG (previously SIG). Before embarking on the startup adventure he spent over 25 years engineering infrastructure platforms for companies like Nomura and Goldman Sachs.