OKE Cluster Architecture

Published Aug 28, 2025 · Updated May 26, 2026 · 10 min read · By Fredrik Filipsson · Independent OCI services

Most OKE problems in production trace back to architecture decisions made early and never revisited. A cluster that was stood up quickly to prove a point becomes the cluster that runs the business, with a single node pool, no spread across failure boundaries and a public API endpoint nobody meant to leave open. Architecting an OKE cluster deliberately is not complicated, but it does require thinking about failure, growth and isolation before the first workload lands. This article lays out how to design an OKE cluster that holds up.

It sits within our OKE and containers series and assumes you have already stood up a cluster following getting started with OKE.

The two halves of a cluster

Every OKE cluster has a managed control plane that Oracle operates and worker capacity that you operate. Architecture is mostly about the worker side and about how the cluster connects to the rest of your network, because control plane availability is Oracle's responsibility and, on enhanced clusters, is covered by a financially backed SLA. Your job is to make sure that when something fails on the worker side or in a single location, the cluster keeps serving.

Designing for failure boundaries

OCI gives you two failure boundaries to design against. Availability domains are physically separate data centres within a region, and fault domains are isolated groups of hardware within a single availability domain. Some regions have multiple availability domains and some have one. The architecture goal is simple: never let a single failure boundary take your cluster down.

Failure boundary	What it protects against	How to use it
Fault domain	Hardware, rack or power failure within a data centre	Spread node pool nodes across all fault domains
Availability domain	Loss of an entire data centre	Spread nodes across availability domains where the region has more than one
Region	Loss of an entire region	Run a second cluster in another region for disaster recovery

In a multi availability domain region, spread your worker nodes across availability domains and let OKE place them across fault domains within each. In a single availability domain region, you cannot protect against the loss of that data centre within one cluster, so fault domain spread and a disaster recovery region carry more weight.

A cluster that runs all its nodes in one fault domain is one rack failure away from an outage. Spread is the cheapest resilience you will ever buy.
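To put a number on what spread buys, here is a minimal sketch in plain Python. It assigns nodes round-robin over every availability domain and fault domain combination, then reports the largest share of nodes that sits behind any single boundary. The labels such as AD-1 and FD-1 and the function names are illustrative assumptions, not OCI identifiers or API calls, and in a real cluster OKE does the placement for you.

```python
from collections import Counter
from itertools import cycle


def spread_nodes(node_count, availability_domains, fault_domains):
    """Assign nodes round-robin over every (AD, FD) pair.

    The labels are hypothetical placeholders, not real OCI identifiers.
    """
    # Walk the availability domains first so consecutive nodes land in
    # different data centres before a fault domain is reused.
    slots = [(ad, fd) for fd in fault_domains for ad in availability_domains]
    placement = cycle(slots)
    return [next(placement) for _ in range(node_count)]


def largest_share(placements, key):
    """Fraction of nodes lost if the busiest single boundary fails."""
    counts = Counter(key(p) for p in placements)
    return max(counts.values()) / len(placements)


# Nine nodes in a region with three availability domains.
multi_ad = spread_nodes(9, ["AD-1", "AD-2", "AD-3"], ["FD-1", "FD-2", "FD-3"])
print(largest_share(multi_ad, key=lambda p: p[0]))   # worst single AD
print(largest_share(multi_ad, key=lambda p: p))      # worst single fault domain

# Six nodes in a region with one availability domain.
single_ad = spread_nodes(6, ["AD-1"], ["FD-1", "FD-2", "FD-3"])
print(largest_share(single_ad, key=lambda p: p[0]))  # 1.0: the AD is a single point of failure
```

In the three availability domain case the worst single data centre failure costs a third of the nodes. In the single availability domain case a fault domain failure costs a third, but the availability domain itself takes everything, which is why a disaster recovery region carries more weight there.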

Node pool strategy

A node pool is a group of identical worker nodes. The instinct is to run one big pool for everything, but separate pools matched to workload needs are almost always better. Stateless web services, batch jobs, GPU-accelerated machine learning and memory-heavy applications all have different shape requirements, and mixing them in one pool means compromising on shape for everything. Separate pools let you size each for its workload and scale them independently.

A common production pattern is a general pool of balanced shapes for stateless services, a separate pool of larger or specialised shapes for heavier workloads, and virtual nodes for bursty or unpredictable work. The mix of managed node pools and virtual nodes is covered in OKE virtual nodes explained, and sizing pools for cost is in OKE cost optimization.
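As a rough illustration of matching workloads to pools, the sketch below picks the smallest pool whose node shape fits a workload, and keeps non-GPU work off GPU nodes. The pool names, OCPU counts and memory sizes are invented for the example and do not correspond to specific OCI shapes. In a real cluster this matching would normally be expressed through node labels, taints and resource requests rather than application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pool:
    name: str
    ocpus: int          # OCPUs per node (hypothetical figure)
    memory_gb: int      # memory per node (hypothetical figure)
    gpu: bool = False


@dataclass(frozen=True)
class Workload:
    name: str
    ocpus: float
    memory_gb: float
    gpu: bool = False


def pick_pool(workload, pools):
    """Return the smallest pool whose node shape fits the workload, or None."""
    candidates = [
        p for p in pools
        if p.gpu == workload.gpu            # GPU nodes only for GPU work
        and p.ocpus >= workload.ocpus
        and p.memory_gb >= workload.memory_gb
    ]
    return min(candidates, key=lambda p: (p.ocpus, p.memory_gb), default=None)


POOLS = [
    Pool("general", ocpus=4, memory_gb=32),
    Pool("memory-heavy", ocpus=8, memory_gb=256),
    Pool("gpu", ocpus=16, memory_gb=240, gpu=True),
]

for w in [
    Workload("web-frontend", ocpus=1, memory_gb=2),
    Workload("in-memory-cache", ocpus=2, memory_gb=128),
    Workload("model-training", ocpus=8, memory_gb=64, gpu=True),
]:
    pool = pick_pool(w, POOLS)
    print(w.name, "->", pool.name if pool else "no fitting pool")
```

With a single pool, the shape would have to be large enough for the cache and carry a GPU for training, so the web frontend would pay for both.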

The control plane endpoint

The Kubernetes API endpoint is how everyone and everything talks to the cluster, and where it lives is a security decision with architectural consequences. A public endpoint is reachable from the internet, which is convenient and dangerous. A private endpoint is reachable only from within your network, which is safer and the right default for production. If you need to reach a private endpoint from outside, you do it through a bastion or a VPN rather than exposing it. This choice ties into the broader security model in OKE security best practices.

Network design

An OKE cluster lives inside a virtual cloud network (VCN), and the subnet layout matters. Use separate subnets for the API endpoint, the worker nodes and the load balancers, and size the pod subnet generously if you use VCN-native pod networking, because every pod takes an IP address from it. Running out of pod IPs is a painful, avoidable failure. Security lists and network security groups then control what can talk to what. The networking model in full is in OKE networking explained.
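The pod subnet arithmetic is simple enough to check before you create anything. The sketch below uses Python's standard ipaddress module to count usable addresses in a CIDR block and to test whether a fully scaled cluster would fit. It assumes three reserved addresses per subnet (the first two and the last, as OCI documents for its subnets), and the node count and pods per node are hypothetical inputs you would replace with your own.

```python
import ipaddress

# Assumption: three addresses in every subnet are reserved and unavailable to pods.
RESERVED_PER_SUBNET = 3


def usable_ips(cidr):
    """Addresses in the subnet that pods could actually take."""
    network = ipaddress.ip_network(cidr)
    return max(network.num_addresses - RESERVED_PER_SUBNET, 0)


def pod_subnet_fits(cidr, max_nodes, pods_per_node):
    """True if every pod on a fully scaled cluster can get its own IP."""
    return usable_ips(cidr) >= max_nodes * pods_per_node


def smallest_prefix_for(pod_count):
    """Longest IPv4 prefix (smallest subnet) between /30 and /16 that holds pod_count pods."""
    for prefix in range(30, 15, -1):
        if 2 ** (32 - prefix) - RESERVED_PER_SUBNET >= pod_count:
            return prefix
    raise ValueError("pod_count does not fit in a /16")


# Hypothetical cluster: up to 20 nodes, 31 pods on each.
peak_pods = 20 * 31
print(usable_ips("10.0.1.0/24"))                 # 253
print(pod_subnet_fits("10.0.1.0/24", 20, 31))    # False
print(pod_subnet_fits("10.0.16.0/20", 20, 31))   # True
print(smallest_prefix_for(peak_pods))            # 22
```

A /24 looks roomy until the node count is multiplied by pods per node. Sizing for the peak node count the autoscaler is allowed to reach, not for today's count, is what keeps this failure from appearing later.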

A reference production topology

Enhanced cluster for the financially backed control plane SLA and access to virtual nodes.
Private API endpoint reachable through a bastion or VPN, never open to the internet.
Worker nodes spread across all availability domains in the region and across fault domains within each.
Multiple node pools matched to workload profiles, scaled independently.
Virtual nodes for bursty and unpredictable workloads to avoid paying for idle capacity.
VCN native pod networking with a generously sized pod subnet and network policy enabled.
Separate subnets for endpoint, workers and load balancers, with network security groups controlling traffic.
A second region cluster for disaster recovery where the workload justifies it.
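The checklist above can be turned into a small review function that flags where a design departs from the reference. This is a sketch under stated assumptions: the ClusterDesign fields are hypothetical names chosen for the example, not properties of any OCI API, and you would populate them by hand or from your own inventory tooling.

```python
from dataclasses import dataclass


@dataclass
class ClusterDesign:
    # All field names are illustrative, not OCI API properties.
    enhanced: bool
    private_endpoint: bool
    ads_in_region: int
    ads_used: int
    node_pools: int
    vcn_native_pod_networking: bool
    network_policy_enabled: bool
    separate_subnets: bool


def review(design):
    """Return a list of gaps between a design and the reference topology."""
    gaps = []
    if not design.enhanced:
        gaps.append("basic cluster: no financially backed SLA, no virtual nodes")
    if not design.private_endpoint:
        gaps.append("API endpoint is public")
    if design.ads_used < design.ads_in_region:
        gaps.append("nodes do not use every availability domain in the region")
    if design.node_pools < 2:
        gaps.append("single node pool for all workloads")
    if not design.vcn_native_pod_networking:
        gaps.append("VCN-native pod networking not in use")
    if not design.network_policy_enabled:
        gaps.append("network policy not enabled")
    if not design.separate_subnets:
        gaps.append("endpoint, workers and load balancers share subnets")
    return gaps


quick_start = ClusterDesign(
    enhanced=False, private_endpoint=False, ads_in_region=3, ads_used=1,
    node_pools=1, vcn_native_pod_networking=True, network_policy_enabled=False,
    separate_subnets=False,
)
for gap in review(quick_start):
    print("-", gap)
```

Disaster recovery is left out of the check on purpose, since the reference only calls for a second region where the workload justifies it.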

Sizing for growth

Architecture should anticipate growth without overbuilding on day one. The autoscaler handles node count growth within a pool, so you do not need to pre-provision capacity, but you do need headroom in the underlying limits: enough IP addresses in the pod subnet, service limits in the tenancy set high enough, and node shapes that can scale to the load you expect. Designing the network and limits for the cluster you will have in a year, while running the capacity you need today, is the balance to strike. Scaling mechanics are in autoscaling OKE workloads.
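One way to test whether a limit leaves a year of room is to project current usage forward at an assumed growth rate and see when it crosses the limit. The sketch below does that with simple compound growth. The pod counts, the 10 percent monthly growth and the limits are hypothetical inputs, and real growth is rarely this smooth, so treat the result as a prompt for review rather than a forecast.

```python
def months_of_headroom(current, monthly_growth, limit, horizon=60):
    """Months until `current`, compounding at `monthly_growth`, exceeds `limit`.

    Returns None if the limit is not exceeded within `horizon` months.
    """
    value = current
    for month in range(horizon + 1):
        if value > limit:
            return month
        value *= 1 + monthly_growth
    return None


# Hypothetical: 200 pods today, growing 10% a month.
print(months_of_headroom(200, 0.10, limit=253))    # 3   (a /24 pod subnet)
print(months_of_headroom(200, 0.10, limit=4093))   # 32  (a /20 pod subnet)
```

The same function works for any limit that grows with the cluster, such as a tenancy service limit on cores or instances.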

Multiple clusters versus one big cluster

A recurring architecture question is whether to run one large shared cluster or several smaller ones. Several smaller clusters give stronger isolation between environments and teams, blast radius containment and independent upgrade schedules, at the cost of more clusters to operate. One large cluster is cheaper to run and simpler to observe, but a problem in it affects everyone. Most mature estates separate at least production from non-production into different clusters, and often separate by team or by sensitivity beyond that. The right answer depends on your isolation needs and your operational capacity.

Bringing it together

A well architected OKE cluster is one where no single failure boundary causes an outage, where workloads run on capacity sized for their needs, where the API endpoint is not exposed, and where the network has room to grow. None of that is exotic, but all of it is far easier to build in at the start than to retrofit. Once the architecture is set, the operational disciplines of scaling, security, delivery and observability sit on top of it cleanly. Continue with OKE networking explained and OKE security best practices.

The OKE solution practice designs cluster architectures to this reference and builds them on a fixed project fee, with managed operations available afterward.


Part of a series
This guide is part of Kubernetes & DevOps on OCI — our complete pillar guide on the topic.

About the author

Fredrik Filipsson, Co-founder of OCI Specialists. 20 years of enterprise IT experience in Oracle Database, OCI cost optimization, licensing, and data platforms.