OKE for Stateful Workloads

Published Oct 1, 2025 · 10 min read · By Fredrik Filipsson · Independent OCI services

Kubernetes was designed first for stateless workloads, the kind you can kill and reschedule anywhere without consequence. Stateful workloads break that assumption. A database, a message broker, or a cache holds data that must survive a pod restart and stay attached to the right replica. Teams often ask whether such workloads belong on OCI Kubernetes Engine at all. The honest answer is that they can run very well on OKE, but only if you understand how state is handled and where the sharp edges are. This article walks through the model, the trade-offs, and a way to decide.

It is part of our OKE and containers series and builds on persistent storage on OKE and OKE cluster architecture.

What makes a workload stateful

A stateful workload is one whose identity and data matter across restarts. A web server is stateless because any instance can serve any request, so the scheduler is free to move it anywhere. A database replica is stateful because it owns a specific slice of data, expects a stable network name, and must reattach to the same storage volume when it restarts. The whole difficulty of running these workloads on Kubernetes comes from preserving that identity and that storage binding while the platform underneath is constantly free to reschedule pods.

Stateless pods are interchangeable. Stateful pods are not, and the platform has to be told how to preserve their identity and their data.

How Kubernetes handles state

Kubernetes provides two building blocks for stateful workloads. The StatefulSet gives each pod a stable, predictable name and its own persistent volume claim, so pod zero always comes back as pod zero with its own volume. The PersistentVolume and PersistentVolumeClaim pair connects a pod to durable storage that outlives the pod itself. On OKE that storage is backed by OCI Block Volume through the Container Storage Interface (CSI) driver, which provisions and attaches volumes automatically as claims are created. Together these mean a database pod can be rescheduled to a new node and still find its data exactly where it left it.
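To make the two building blocks concrete, here is a minimal sketch in Python that builds a StatefulSet manifest with a volume claim template and derives the stable names Kubernetes assigns to each replica. The workload name `db`, the headless Service name `db-headless`, the image, the 50Gi size, and the `oci-bv` storage class name are illustrative assumptions rather than values from this article, so check the storage classes in your own cluster before reusing them.

```python
import json


def statefulset_manifest(name, service, replicas, image, size, storage_class):
    """Build a minimal apps/v1 StatefulSet with one volume claim template."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            # Headless Service that gives each pod a stable DNS name.
            "serviceName": service,
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "volumeMounts": [
                            {"name": "data", "mountPath": "/var/lib/data"}
                        ],
                    }],
                },
            },
            # Kubernetes creates one PersistentVolumeClaim per pod from this template.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "storageClassName": storage_class,
                    "resources": {"requests": {"storage": size}},
                },
            }],
        },
    }


def stable_identities(manifest, namespace="default"):
    """Return (pod name, claim name, DNS name) for each replica ordinal.

    Assumes the default cluster domain, cluster.local.
    """
    spec = manifest["spec"]
    name = manifest["metadata"]["name"]
    claim = spec["volumeClaimTemplates"][0]["metadata"]["name"]
    identities = []
    for ordinal in range(spec["replicas"]):
        pod = f"{name}-{ordinal}"
        dns = f"{pod}.{spec['serviceName']}.{namespace}.svc.cluster.local"
        identities.append((pod, f"{claim}-{pod}", dns))
    return identities


# All values below are hypothetical examples.
manifest = statefulset_manifest(
    name="db",
    service="db-headless",
    replicas=3,
    image="example.invalid/db:1.0",
    size="50Gi",
    storage_class="oci-bv",  # assumed name of the Block Volume CSI storage class
)

if __name__ == "__main__":
    print(json.dumps(manifest, indent=2))  # kubectl apply -f accepts JSON as well as YAML
    for pod, claim, dns in stable_identities(manifest):
        print(pod, claim, dns)
```

The second function is the point of the sketch: replica 0 is always `db-0`, bound to the claim `data-db-0`, wherever it is scheduled.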

OKE specifics for stateful workloads

Several OCI details shape how stateful workloads behave on OKE. Block Volumes are tied to an availability domain, so a pod that needs its volume must be scheduled in the same domain the volume lives in, which constrains failover across domains. Block Volume performance levels let you match volume throughput to a demanding database instead of over-provisioning capacity to get it. And node pool design matters, because draining a node during an upgrade forces stateful pods to detach and reattach their volumes, so you want that to happen in a controlled, one-at-a-time fashion rather than all at once.

Concern	Stateless workload	Stateful workload on OKE
Pod identity	Interchangeable	Stable name via StatefulSet
Storage	None or ephemeral	Block Volume via PersistentVolumeClaim
Rescheduling	Move anywhere freely	Constrained to the volume availability domain
Upgrades	Roll freely	Drain carefully, one replica at a time
Backup	Rebuild from image	Volume snapshots plus application backups
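The availability domain constraint in the table can be illustrated with a small model. The sketch below is a deliberate simplification of what the scheduler checks, not its actual implementation: it filters nodes by the standard `topology.kubernetes.io/zone` label. The node names and the zone values `AD-1` and `AD-2` are hypothetical placeholders, since the real label values depend on your region.

```python
from dataclasses import dataclass, field

ZONE_LABEL = "topology.kubernetes.io/zone"  # well-known Kubernetes topology label


@dataclass
class Node:
    name: str
    labels: dict = field(default_factory=dict)
    schedulable: bool = True  # False models a cordoned or failed node


def eligible_nodes(nodes, volume_zone):
    """Names of nodes that could host a pod whose volume lives in volume_zone."""
    return [
        node.name
        for node in nodes
        if node.schedulable and node.labels.get(ZONE_LABEL) == volume_zone
    ]


# Hypothetical two-node pool spread across two availability domains.
nodes = [
    Node("node-a", {ZONE_LABEL: "AD-1"}),
    Node("node-b", {ZONE_LABEL: "AD-2"}),
]

if __name__ == "__main__":
    print(eligible_nodes(nodes, "AD-1"))
    nodes[0].schedulable = False  # the only node in AD-1 goes away
    print(eligible_nodes(nodes, "AD-1"))
```

With only one node per domain, losing `node-a` leaves the pod nowhere to go even though `node-b` is healthy, which is the argument for keeping spare capacity in every domain that holds volumes.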

When OKE is a good home for state

OKE suits stateful workloads when the application was built for Kubernetes operation, for example a modern distributed database or message broker that ships an operator to handle clustering, failover, and backups. In that case the operator does the hard work and OKE simply provides scheduling and storage. OKE is also a good fit when you already run the rest of an application on the cluster and want the data tier to share the same network, identity, and observability rather than living in a separate silo.

When a managed service is the better choice

For a traditional Oracle Database, OKE is usually the wrong place. A managed service such as Autonomous Database or a database system on OCI handles patching, backups, and high availability for you, and trying to reproduce that inside Kubernetes adds risk without adding value. The general rule is that if OCI offers a managed service that matches your workload, use it for the data tier and keep OKE for the application tier. Reserve self-managed state on OKE for workloads that genuinely benefit from running beside the rest of the cluster.

Failure modes to plan for

Stateful workloads on OKE introduce failure modes that stateless ones do not. A node failure forces the volume to detach and reattach elsewhere, which takes time and can leave the pod pending if no node with spare capacity exists in the volume's availability domain. A storage class misconfiguration can silently bind pods to volumes that do not have the throughput the database needs. And an unplanned mass drain during an upgrade can take several replicas down at once. Each of these is manageable, but only if you have designed for it rather than discovering it during an incident, which is why troubleshooting OKE clusters is worth reading alongside this.
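One common guard against the mass drain case is a PodDisruptionBudget, which limits how many replicas a voluntary disruption such as a node drain may evict at once. The sketch below builds a `policy/v1` manifest and models the eviction rule in simplified form. The names `db-pdb` and `db` are hypothetical, and a budget does not protect against involuntary failures such as a node crash.

```python
import json


def pod_disruption_budget(name, app_label, max_unavailable=1):
    """Build a policy/v1 PodDisruptionBudget for pods labelled app=<app_label>."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            "maxUnavailable": max_unavailable,
            "selector": {"matchLabels": {"app": app_label}},
        },
    }


def eviction_allowed(healthy, desired, max_unavailable):
    """Simplified model of the budget check applied to each eviction request.

    An eviction is allowed only if, after it, the number of unavailable
    replicas would still be within max_unavailable.
    """
    currently_unavailable = desired - healthy
    return currently_unavailable + 1 <= max_unavailable


# Hypothetical names for a three-replica database.
pdb = pod_disruption_budget("db-pdb", "db", max_unavailable=1)

if __name__ == "__main__":
    print(json.dumps(pdb, indent=2))
    print(eviction_allowed(healthy=3, desired=3, max_unavailable=1))
    print(eviction_allowed(healthy=2, desired=3, max_unavailable=1))
```

With all three replicas healthy the first eviction goes through. While that replica is still reattaching its volume on another node, a second eviction is refused, so the drain waits instead of taking down another replica.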

A decision framework for stateful workloads on OKE

Check for a managed service first. If OCI manages this data tier well, prefer it over self-managed state.
Confirm the app is Kubernetes-native, ideally with an operator that handles clustering and failover.
Design storage deliberately, choosing the right Block Volume performance tier and availability domain layout.
Use StatefulSets, never plain Deployments, for anything that owns durable data.
Plan upgrades and drains so replicas move one at a time, never all at once.
Back up the data, not just the volume, combining snapshots with application-level backups.
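The checklist above can be expressed as a small review function, which some teams may find useful as a design review aid. This is only a sketch of the six steps: the `StatefulPlan` fields and the wording of the findings are hypothetical, and the inputs remain human judgments rather than anything read from a cluster.

```python
from dataclasses import dataclass


@dataclass
class StatefulPlan:
    managed_service_fits: bool         # step 1: does OCI manage this data tier well?
    has_operator: bool                 # step 2: operator for clustering and failover
    storage_designed: bool             # step 3: performance tier and AD layout chosen
    controller_kind: str               # step 4: "StatefulSet" or "Deployment"
    max_replicas_drained_at_once: int  # step 5: drain concurrency during upgrades
    volume_snapshots: bool             # step 6: volume level backups
    application_backups: bool          # step 6: application level backups


def review(plan):
    """Return findings in checklist order; an empty list means no objections."""
    findings = []
    if plan.managed_service_fits:
        findings.append("Prefer the managed OCI service for this data tier.")
    if not plan.has_operator:
        findings.append("No operator: clustering and failover become your job.")
    if not plan.storage_designed:
        findings.append("Choose a performance tier and availability domain layout.")
    if plan.controller_kind != "StatefulSet":
        findings.append("Use a StatefulSet for anything that owns durable data.")
    if plan.max_replicas_drained_at_once > 1:
        findings.append("Drain one replica at a time.")
    if not (plan.volume_snapshots and plan.application_backups):
        findings.append("Combine volume snapshots with application backups.")
    return findings


if __name__ == "__main__":
    # Hypothetical plan: a broker with an operator but two weak spots.
    plan = StatefulPlan(
        managed_service_fits=False,
        has_operator=True,
        storage_designed=True,
        controller_kind="Deployment",
        max_replicas_drained_at_once=1,
        volume_snapshots=True,
        application_backups=False,
    )
    for finding in review(plan):
        print("-", finding)
```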

Bringing it together

OKE can run stateful workloads well, but it asks more of you than stateless ones do. StatefulSets and Block Volume-backed claims preserve identity and data, while careful node pool, availability domain, and upgrade design keep the workload healthy. For traditional databases a managed OCI service is usually the safer path, with OKE reserved for Kubernetes-native data systems and the application tier. Continue with persistent storage on OKE, OKE cluster architecture, and troubleshooting OKE clusters. The OKE solution practice designs stateful platforms on OKE on a fixed project fee.


Part of a series
This guide is part of Kubernetes & DevOps on OCI — our complete pillar guide on the topic.

About the author

Fredrik Filipsson, Co-founder of OCI Specialists. 20 years of enterprise IT experience in Oracle Database, OCI cost optimization, licensing, and data platforms.
