How To Build a CI/CD Pipeline for Kubernetes Stateful Applications

4/02/2022
demo

This series of posts will walk you through a hands-on experience focused on building a simple (but cool) stateful application that has been designed to run on Kubernetes. Part 1 sets the stage.

Provisioning stateful applications on-demand in Kubernetes can be challenging. Stateful apps are built on databases, message buses, or similar systems that persist data to disk. If you think, “Well, it’s almost the case for every application,” then you’re right, but the gotcha is that generally, the backend doesn’t live in Kubernetes.

In addition, the architecture of these applications is generally spread across heterogeneous platforms. Typically, the frontend component already sits in Kubernetes and can be provisioned and scaled up or down on demand, as it is usually stateless. Meanwhile, data sets can be hosted in various formats and form factors: on-premises virtual machines, PaaS services in the public cloud, Big Data platforms, or even physical machines.

This series explores the DevOps experience when developing stateful applications designed to run on Kubernetes and building on-demand data services. Why run all components in Kubernetes, you may ask. The goal is for application owners to accelerate the release of business-critical applications, taking advantage of native Kubernetes benefits.

We’re going to focus on the toolset that enables developers to navigate through the development lifecycle of these applications efficiently. Namely, we’re going to delve into:

  • Kustomize
  • Skaffold
  • Ondat Persistent Volumes
  • MongoDB Database and Replica Set
  • MongoDB Community Kubernetes Operator
  • PyMongo

If this is not the most DevOps setup for a stateful app, I don’t know what DevOps means anymore 😆.

Meet the Marvel App

We are going to build an app that displays randomly selected Marvel character cards to demonstrate Kubernetes capabilities. Our goal is also to explore how DevOps tools fit in the application development process. Hopefully, this will be a lot more fun than yet another “Hello World” application! The app architecture is represented below:

 

The code is available here: https://github.com/vfiftyfive/FlaskMarvelApp.

The FE (Frontend) component is a Python Flask application. Its role is to provide a visualization layer for data ingested into the MongoDB database. It runs in Kubernetes as a Deployment.
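To give an idea of what "displaying a random card" can look like on the FE side, here is a minimal sketch. It assumes the characters are stored as documents in a single collection and relies on MongoDB's $sample aggregation stage; the actual Marvel app may pick its cards differently. The function names, the in-memory stand-in, and the sample documents are all hypothetical and only there so the snippet runs without a database.

```python
import random


def sample_pipeline(count=1):
    """Build an aggregation pipeline asking MongoDB for `count` random documents."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return [{"$sample": {"size": count}}]


def random_characters(collection, count=1):
    """Return `count` random character documents.

    `collection` is anything exposing aggregate(pipeline), such as a
    PyMongo Collection object.
    """
    return list(collection.aggregate(sample_pipeline(count)))


class InMemoryCollection:
    """Hypothetical stand-in for a MongoDB collection, for local testing only."""

    def __init__(self, documents):
        self.documents = list(documents)

    def aggregate(self, pipeline):
        # Only the $sample stage is emulated here.
        size = pipeline[0]["$sample"]["size"]
        return iter(random.sample(self.documents, min(size, len(self.documents))))


# Placeholder documents; the real ones come from the Marvel API.
cards = InMemoryCollection([{"name": "Hulk"}, {"name": "Thor"}, {"name": "Storm"}])
print(random_characters(cards, count=1))
```

In a Flask view, the same random_characters function would simply receive the real PyMongo collection instead of the in-memory one.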

The BE (Backend) is a MongoDB database deployed as a 3-node cluster and managed by the MongoDB Community Kubernetes Operator. It runs in Kubernetes as a StatefulSet, which is standard for an application that needs to persist data to disk.

As shown in the picture, a StatefulSet is managed by a Kubernetes controller that ensures each Pod has access to its own datastore: every Pod gets a dedicated PersistentVolumeClaim (PVC), which is bound to a persistent volume (PV). For more information on StatefulSets and their use cases, you can take a look here on the Ondat website or in the official Kubernetes documentation.

As part of the application deployment, a Kubernetes Job is also provisioned. Its role is to fetch Marvel characters’ information from the Marvel API available here and store it in the MongoDB database. The Job only runs once, at application deployment time. Kubernetes will schedule Pods until the Job completes successfully (within a limited number of retries). Pods may fail while waiting for the connection to the database to be established, but once the Job has succeeded, it is never rerun. The FE can then display random character cards retrieved directly from the database.
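The "fail until the database is reachable" behavior can be sketched as follows. This is not the code of the actual Job, just an illustration of the pattern: the container checks whether the database port answers, and returns a non-zero exit code if it doesn't, so that Kubernetes marks the Pod as failed and retries within the limit set by the Job's backoffLimit. The function names and the load_data callable are hypothetical.

```python
import socket
import time


def wait_for_port(host, port, attempts=5, delay=1.0, timeout=2.0):
    """Return True once a TCP connection to host:port succeeds, else False.

    Tries up to `attempts` times, sleeping `delay` seconds between tries.
    """
    for attempt in range(1, attempts + 1):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            if attempt < attempts:
                time.sleep(delay)
    return False


def run_job(host, port, load_data, **wait_options):
    """Return the exit code for the Job container.

    `load_data` is a hypothetical callable that fetches the characters
    and writes them to the database.
    """
    if not wait_for_port(host, port, **wait_options):
        # A non-zero exit marks the Pod as failed, so Kubernetes can retry.
        return 1
    load_data()
    return 0
```

The entry point of the container would then end with something like sys.exit(run_job(...)), using the hostname and port of the MongoDB service.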

The application comprises the FE, the BE, and a data initialization step that populates the BE with relevant information. It can be deployed in Kubernetes using YAML manifests dynamically generated by a command-line tool or as part of a CI/CD pipeline. Kustomize is our tool of choice in this article for generating manifests. It can also update objects related to a particular build iteration. With Kustomize, you can easily update container image references, or for our use case, deploy the application with a new version of Mongo, or provision a new Ondat StorageClass.

Templatize and Automate All the Things!

It all begins with the ability to iterate application testing and deployment in various development stages, such as locally on a laptop, on a remote testing cluster, or during staging, User Acceptance Testing (UAT), and production phases. Kustomize can be used as a kubectl option (-k) to apply customization when generating application manifests. This tool allows developers to dynamically adapt application requirements and context to specific environments. However, we generally recommend using Kustomize as a separate binary, since the version embedded in kubectl can lag behind the standalone release.

The principle of Kustomize is to build an overlay by specifying the elements from the base manifests you want to modify. In this article, we focus on the application development phase, where we code and test the application within a local Kubernetes K3s cluster. In that use case, Kustomize needs two folders, one for the base manifests and one for the dev overlay:

base/
    job.yaml
    kustomization.yaml
    marvel_deploy.yaml
    marvel_svc.yaml
    mongodbcommunity_cr.yaml
    ondat_sc.yaml
overlay/
    dev/
        job.yaml
        kustomization.yaml
        marvel_deploy.yaml
        mongodbcommunity_cr.yaml
        name_reference.yaml
        ondat_sc.yaml

The content of these files can be accessed here. One can simply run Kustomize with the build option to generate the appropriate Kubernetes manifests. For example, to modify the base manifests with the dev overlay, you can run kustomize build overlay/dev, assuming you’re in the parent folder of the “overlay” directory. The output is a set of manifests directly displayed on your terminal, so if you want to save the result as YAML, just redirect the output to a file. Another option is to use the output of kustomize build as input for kubectl apply like this:

kustomize build overlay/dev | kubectl apply -f -

It will directly deploy the objects into your Kubernetes cluster.
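To make the overlay principle more concrete, here is a deliberately simplified model of it: a patch that only lists the fields to change, merged on top of a base manifest. This is a conceptual sketch, not Kustomize's actual algorithm, which is richer (for example in the way it handles lists). The manifest fields and names below are hypothetical and reduced to the bare minimum.

```python
import copy


def apply_overlay(base, patch):
    """Return a copy of `base` with `patch` merged on top.

    Nested mappings are merged key by key; any other value in `patch`
    replaces the base value outright (lists included).
    """
    merged = copy.deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_overlay(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical base manifest and dev patch, reduced to a few fields.
base_deploy = {
    "kind": "Deployment",
    "metadata": {"name": "marvel-frontend"},
    "spec": {"replicas": 3, "template": {"metadata": {"labels": {"app": "marvel"}}}},
}
dev_patch = {"spec": {"replicas": 1}}

dev_deploy = apply_overlay(base_deploy, dev_patch)
print(dev_deploy["spec"]["replicas"])  # the dev overlay wins for this field
```

The point to take away is that the overlay stays small: everything it doesn't mention is inherited from the base, and the base itself is never modified.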

Aligning the Moving Parts

When developing an application designed to run on Kubernetes, you will face repetitive tasks very early. They include building new container images as you commit new code, updating the Kubernetes manifests, and deploying a new version of your application stack into the Kubernetes test cluster so you can execute your tests.

Our goal is to provide a pipeline that automatically builds the container images, updates the manifests accordingly, and continuously deploys the application stack as soon as we modify the source code. The workflow looks like this:

Skaffold is one of the tools that allows you to do just that. It is an open-source project by Google that provides a CLI to manage the lifecycle of your application within various stages of the CI/CD pipeline. It can help with your app’s development, build, and deployment phases. In our use case, we’re interested in the early dev stage. The role of Skaffold will be to build a new image from the FE Dockerfile every time the code is saved locally, before performing git commit, and to deploy it to the dev Kubernetes cluster using Kustomize. As a result, you don’t need to commit or push code to your git repository to test it.

Let’s start with the content of the FE Dockerfile.

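FROM python:3.9

WORKDIR /code

# Copy the dependency list first so this layer stays cached until it changes
COPY ./requirements.txt /code/requirements.txt

RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt

# Copy the application source code
COPY ./app /code/app

# Start gunicorn with the app's configuration file, listening on port 80
CMD ["gunicorn", "--config", "app/gunicorn_conf.py", "--bind", "0.0.0.0:80", "app.main:app"]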

Here we perform very standard operations:

  • Install the dependencies required by the Flask application
  • Copy the source code to the container
  • Run the app with the gunicorn WSGI server (as the app doesn’t serve any static HTML pages, there’s no need for nginx or another HTTP server).

The Skaffold dev mode detects changes to the application source code as they happen, automatically builds a new container image using that Dockerfile, and deploys it to the dev Kubernetes cluster. There’s no need to perform git commit or git push to trigger this process via a webhook. In that mode, the Skaffold binary keeps running and watches for code changes. Skaffold can deploy the application components using different tools. We have chosen Kustomize, but Docker, kubectl, and Helm are also available options. Similarly, the Skaffold build phase can leverage Dockerfiles, buildpacks, and other tools mentioned in the documentation, as well as custom scripts.
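If you wonder what "detecting code changes" boils down to, the idea can be sketched in a few lines. This is only an illustration of the watch, rebuild, and redeploy loop, assuming a simple approach based on content hashes; it is not how Skaffold is implemented, and all the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file under `root` (relative path) to a SHA-256 digest of its content."""
    root = Path(root)
    return {
        str(path.relative_to(root)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(before, after):
    """Return the sorted paths added, removed, or modified between two snapshots."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )


def watch_once(root, previous, on_change):
    """Take a new snapshot, call on_change(paths) if anything differs, return the snapshot."""
    current = snapshot(root)
    changes = changed_files(previous, current)
    if changes:
        on_change(changes)  # e.g. rebuild the image, then redeploy
    return current
```

Calling watch_once in a loop with a short sleep, and passing a callback that triggers the build and the deployment, gives a crude version of the dev loop described above.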

Since we’re using an ARM-based architecture for development, we need a custom script to perform Docker cross-platform builds. An example of such a script is given here. The build.sh script we use is located at the root of the Marvel app repository. It contains the same code as in the example. Skaffold utilizes this script to build the image artifact in the build phase. Then, Kustomize dynamically generates the Kubernetes manifests, and Skaffold deploys them to the cluster.

If your machine runs on an x86 processor, you don’t need any custom script or additional build commands. But don’t worry, we’ll dive into every component in detail!

Next time, we’ll get our hands dirty by going through every single step required to build this pipeline. We’ll also dig into the MongoDB Operator and explain why you need it!

 
written by:
Nic Vermandé
Nic is an experienced hands-on technologist, evangelist and product owner who has been working in the fields of Cloud-Native technologies, Open Source Software, Virtualization and Datacenter networking for the past 17 years. Passionate about enabling users and building cool tech solving real-life problems, you’ll often see him speaking at global tech conferences and online events, spreading the word and walking the walk with customers.
