We look at key choices when it comes to providing storage for containerised applications, and whether to choose block, file or object storage.

In the early days of container technology, these virtual entities were designed to be ephemeral, stateless and light on data. However, the landscape has shifted dramatically. As highlighted by Gartner, the applications for containers have expanded to encompass analytics and artificial intelligence (AI) processing. By 2028, the research firm predicts, 15% of on-premise production workloads will run in containers, a 300% increase on 2022.
While containers continue to offer the benefits of ephemerality, rapidly reproducing and then just as swiftly disappearing to accommodate workload surges, the storage associated with them cannot abide by the same principles. As businesses transition from proofs of concept to running a significant portion of production workloads in containers, the storage layer has emerged as a critical fulcrum.
In the early stages, the focus was on straightforward web scaling. However, containers have now ventured into the territory of mission-critical databases, extensive data science pipelines, and the resource-intensive world of generative AI (GenAI). The challenge lies in navigating crucial decisions such as file versus block versus object storage, Container Storage Interface (CSI) versus container-native storage, and whether to opt for a dedicated container storage platform.
Containerisation is a more streamlined form of virtualisation. Unlike traditional virtual machines (VMs) that necessitate a hypervisor and a complete guest operating system (OS), containers share the host server’s OS. This makes them lighter, quicker to scale, and more portable. They are constructed on microservices principles that fragment monolithic applications into distinct, application programming interface (API)-linked components, aligning with DevOps methodologies.
Several orchestrators exist, such as Docker Swarm and OpenShift, but Kubernetes is the market leader. It manages clusters of nodes, on which pods run the containers. Each cluster is overseen by a control plane, which houses the API server, a scheduler for pod placement, controllers that maintain the desired state, and etcd, the key-value store that holds cluster state and configuration.
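To make this concrete, the snippet below is a minimal sketch of a pod manifest of the kind the scheduler places onto a node. The names (`web`, the `nginx` image and its tag) are illustrative choices, not taken from the article:

```yaml
# Minimal pod sketch: one container, scheduled by the control plane onto a node
apiVersion: v1
kind: Pod
metadata:
  name: web            # illustrative name
spec:
  containers:
  - name: nginx
    image: nginx:1.25  # illustrative image and tag
    ports:
    - containerPort: 80
```

Applied with `kubectl apply -f`, the scheduler picks a node and the kubelet on that node starts the container; delete the pod and, absent persistent storage, its data goes with it.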
In its original conception, container storage was ephemeral: data disappeared when a pod was deleted. To support enterprise applications, Kubernetes introduced persistent volumes (PVs), storage resources attached to a cluster that decouple storage from compute, with applications requesting them through persistent volume claims (PVCs). This allows applications to remain portable while maintaining access to their data.
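A sketch of how an application requests persistent storage, assuming illustrative names throughout (`db-data`, the pod name, the mount path and the 10Gi size are not from the article):

```yaml
# A claim for persistent storage; Kubernetes binds it to a matching PV
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-data          # illustrative name
spec:
  accessModes:
  - ReadWriteOnce        # mountable read-write by a single node
  resources:
    requests:
      storage: 10Gi      # illustrative size
---
# A pod that mounts the claim; the data outlives the pod
apiVersion: v1
kind: Pod
metadata:
  name: db
spec:
  containers:
  - name: db
    image: postgres:16   # illustrative image
    volumeMounts:
    - name: data
      mountPath: /var/lib/postgresql/data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: db-data
```

Because the pod references the claim rather than any particular disk, the pod can be deleted and rescheduled while the volume, and the data on it, persists.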
The Container Storage Interface (CSI) is a standard that enables storage suppliers to expose their systems to Kubernetes. More than 130 drivers are available. CSI allows Kubernetes to trigger advanced data services such as snapshots, cloning, and automated provisioning across block, file, and object storage in on-premise and cloud environments.
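In practice, a CSI driver is exposed to applications through a storage class, which PVCs name to get volumes provisioned automatically. The sketch below assumes a hypothetical driver name; the real provisioner string is supplied by the storage vendor's CSI driver:

```yaml
# A storage class backed by a CSI driver; PVCs referencing it
# trigger automated provisioning on the underlying array or cloud service
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-block             # illustrative name
provisioner: csi.example.com   # hypothetical CSI driver; vendor-specific in reality
parameters:
  type: ssd                    # parameters are driver-specific; illustrative here
reclaimPolicy: Delete
allowVolumeExpansion: true
```

Snapshots and clones follow the same pattern: a driver that supports them exposes a volume snapshot class, and Kubernetes objects reference it to trigger the data service on the back-end system.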
Container-native storage, in which the storage layer itself runs in containers, potentially offers the advantage of portability across on-premise and cloud environments. Meanwhile, CSI is more likely to tie a deployment to a particular supplier's storage arrays.
In conclusion, the evolution of container technology has brought about a paradigm shift in how businesses handle their workloads. From being ephemeral and stateless, containers have evolved to handle mission-critical databases and AI processing. As businesses continue to adopt this technology, understanding the nuances of storage options, such as CSI and container-native storage, will be crucial to maximising the benefits of containerisation.