Kubernetes is fast becoming an industry standard, with up to 94% of organizations deploying their services and applications on the container orchestration platform, according to a survey. One of the main reasons companies use Kubernetes is standardization, which allows advanced users to see productivity gains of up to twofold.
Standardization on Kubernetes gives organizations the ability to deploy any workload anywhere. But one piece was missing: the technology assumed workloads were ephemeral, meaning only stateless workloads could be safely deployed on Kubernetes. However, the community recently changed the paradigm and introduced features like StatefulSets and Storage Classes, which make it possible to run data workloads on Kubernetes.
While running stateful workloads on Kubernetes is possible, it is still challenging. In this article, I’ll describe ways to make it happen and explain why it’s worth it.
Do it gradually
Kubernetes is fast becoming as popular as Linux and the de facto way to distribute any application anywhere. Using Kubernetes involves learning many technical concepts and vocabulary. For example, newcomers may struggle with Kubernetes’ many logical units, such as containers, pods, nodes, and clusters.
If you’re not already using Kubernetes in production, don’t jump right into data workloads. Instead, start by moving stateless applications, so that no data is lost if things go wrong.
If you can’t find an operator that fits your needs, don’t worry: most of them are open source.
Understand the limitations and specifics
Once you’re familiar with general Kubernetes concepts, dive into the details of the concepts specific to stateful workloads. For example, because applications can have different storage needs, such as performance or capacity requirements, you must specify the correct underlying storage system.
What the industry generally calls Storage Profiles is called Storage Classes in Kubernetes. They provide a way to describe the different classes of storage that a Kubernetes cluster can access. Storage Classes can differ in quality-of-service levels (such as I/O operations per second per GiB), in backup policies, or in arbitrary policies such as binding modes and allowed topologies.
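To make the idea concrete, here is a minimal sketch that builds a Storage Class manifest as a plain Python dictionary and prints it as JSON, which Kubernetes accepts alongside YAML. The class name `fast-ssd`, the provisioner `csi.example.com`, the `parameters` keys, and the zone `zone-a` are all hypothetical placeholders: real parameters depend entirely on the storage driver your cluster uses.

```python
import json


def storage_class(name, provisioner, parameters, *,
                  binding_mode="WaitForFirstConsumer",
                  allow_expansion=True,
                  zones=None):
    """Build a StorageClass manifest as a plain dict."""
    manifest = {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,
        # Free-form string map; its meaning is defined by the provisioner.
        "parameters": parameters,
        # Binding mode: bind immediately or wait until a pod uses the claim.
        "volumeBindingMode": binding_mode,
        "allowVolumeExpansion": allow_expansion,
    }
    if zones:
        # Allowed topologies: restrict where volumes may be provisioned.
        manifest["allowedTopologies"] = [{
            "matchLabelExpressions": [{
                "key": "topology.kubernetes.io/zone",
                "values": list(zones),
            }]
        }]
    return manifest


# All values below are hypothetical and for illustration only.
fast = storage_class(
    "fast-ssd",
    "csi.example.com",
    {"type": "ssd", "iopsPerGB": "50"},
    zones=["zone-a"],
)
print(json.dumps(fast, indent=2))
```

A second call with different parameters (say, a slower and cheaper tier) would give the cluster two classes that applications can choose between by name.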
Another crucial concept to understand is the StatefulSet. It is the Kubernetes API object used to manage stateful applications, and it provides key features such as:
- Stable, unique network identifiers that let you keep track of each volume and detach and reattach it as needed;
- Stable, persistent storage so your data is safe;
- Ordered, graceful deployment and scaling, which many Day 2 operations require.
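The features listed above can be sketched with a small example. The code below builds a StatefulSet manifest as a Python dictionary and derives the stable pod names Kubernetes assigns (`<name>-<ordinal>`, starting at 0). The application name `db`, the image `example.com/db:1.0`, the mount path, and the Storage Class name `fast-ssd` are hypothetical; treat this as an illustration under those assumptions, not a production manifest.

```python
import json


def stateful_set(name, image, replicas, storage, storage_class_name):
    """Build a minimal StatefulSet manifest as a plain dict."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            # Name of a headless Service (created separately) that gives
            # each pod a stable network identity.
            "serviceName": name,
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "volumeMounts": [
                            {"name": "data", "mountPath": "/var/lib/data"}
                        ],
                    }]
                },
            },
            # One PersistentVolumeClaim is created per pod from this template.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "storageClassName": storage_class_name,
                    "resources": {"requests": {"storage": storage}},
                },
            }],
        },
    }


def pod_names(manifest):
    """Return the stable pod names, in the order the pods are created."""
    name = manifest["metadata"]["name"]
    return [f"{name}-{i}" for i in range(manifest["spec"]["replicas"])]


# Hypothetical values for illustration only.
db = stateful_set("db", "example.com/db:1.0", 3, "10Gi", "fast-ssd")
print(json.dumps(db, indent=2))
print(pod_names(db))
```

Because the names are predictable, a pod that is rescheduled comes back under the same identity and is reattached to the same claim.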
While StatefulSet has been a successful replacement for the infamous PetSet (now obsolete), it is still imperfect and has limitations. For example, the StatefulSet controller has no built-in support for resizing the volume (PVC), which is a major challenge when the size of your application dataset is about to grow beyond its currently allocated storage capacity. There are workarounds, but such limitations must be understood well in advance so that the technical team knows how to deal with them.
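As an illustration of one commonly described workaround, the sketch below prepares a resize patch for each claim owned by a StatefulSet, relying on the fact that those claims are named `<claim template>-<StatefulSet name>-<ordinal>`. It only generates the names and patch bodies and does not talk to a cluster. It assumes the Storage Class allows volume expansion and the storage driver supports it; the names `db` and `data` and the size `20Gi` are hypothetical. Note that the claim template inside the StatefulSet itself stays at the old size, so replicas added later would still get the original capacity.

```python
import json


def pvc_names(statefulset_name, claim_template, replicas):
    """Names of the claims a StatefulSet creates, one per replica."""
    return [
        f"{claim_template}-{statefulset_name}-{i}" for i in range(replicas)
    ]


def resize_patches(statefulset_name, claim_template, replicas, new_size):
    """Map each claim name to the patch body that requests the new size."""
    patch = {"spec": {"resources": {"requests": {"storage": new_size}}}}
    return {
        pvc: patch
        for pvc in pvc_names(statefulset_name, claim_template, replicas)
    }


# Hypothetical values for illustration only.
patches = resize_patches("db", "data", 3, "20Gi")
for pvc, body in patches.items():
    # Each body could be applied to its claim with a tool of your choice.
    print(pvc, json.dumps(body))
```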