OKE Upgrade Strategy

Published Sep 26, 2025 · 9 min read
By Fredrik Filipsson · Independent OCI services

Kubernetes moves fast, and OKE tracks it. New versions arrive regularly, older ones fall out of support, and a cluster that is left alone will eventually run a version that no longer receives fixes. Upgrading is therefore not optional, but it is also the kind of change that can break workloads if done carelessly. A clear upgrade strategy turns version maintenance from a recurring scare into a routine, low-drama operation. This article explains how to upgrade OKE safely and keep doing it.

It is part of our OKE and containers series and pairs with troubleshooting OKE clusters.

Why you cannot skip upgrades

Each Kubernetes version is supported for a limited time, after which it stops receiving security and bug fixes. Running an unsupported version means living with known vulnerabilities and having no path to help when something breaks. Beyond support, newer versions bring features, performance improvements and fixes you will eventually want. The cost of staying current is the occasional upgrade; the cost of falling behind is a large, risky catch-up later, often forced by a security finding. Steady beats heroic.

The two parts of an OKE upgrade

An OKE upgrade has two halves that happen separately. First the control plane is upgraded to the new version, which OKE manages for you. Then the worker nodes are upgraded to match, which you control. Kubernetes supports a version gap between the control plane and the nodes, so the two do not have to move in the same instant. The gap runs in one direction only: nodes may run an older minor version than the control plane, never a newer one, and the Kubernetes version skew policy limits how many minor versions they may trail. The nodes should therefore not lag far behind. Knowing these are distinct steps is the foundation of a safe upgrade.

Step	Who drives it	Main risk
Control plane upgrade	OKE managed	API behaviour changes
Worker node upgrade	You	Workload disruption during node replacement
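
The gap between control plane and nodes can be checked mechanically before and after each half of the upgrade. The Python sketch below is one way to do that; the function names are illustrative, and the max_skew default of 2 is an assumed limit rather than an OKE setting, so confirm the limit that applies to your versions in the Kubernetes version skew policy.

```python
import re


def parse_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from strings such as 'v1.31.1' or '1.31'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def node_skew(control_plane: str, node: str) -> int:
    """Minor versions by which the node trails the control plane.

    A negative result means the node is newer than the control plane.
    """
    cp_major, cp_minor = parse_minor(control_plane)
    node_major, node_minor = parse_minor(node)
    if cp_major != node_major:
        raise ValueError("major versions differ")
    return cp_minor - node_minor


def skew_is_acceptable(control_plane: str, node: str, max_skew: int = 2) -> bool:
    # max_skew is an assumed limit. Nodes newer than the control plane
    # (negative skew) are always rejected.
    return 0 <= node_skew(control_plane, node) <= max_skew
```

Running a check like this against every node pool after the control plane moves shows which pools need attention first.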

Upgrade nodes without dropping workloads

The worker node upgrade is where workloads are at risk, because nodes are drained and replaced to bring them onto the new version. Done well, you add new nodes on the new version, move pods onto them gracefully, and remove the old nodes, so the workload keeps serving throughout. Pod disruption budgets tell Kubernetes how many replicas of a workload must stay available while nodes are drained, which prevents an upgrade from taking a service down. A budget only helps a workload that runs more replicas than it needs to stay available; with a single replica there is nothing left to keep serving while the pod moves. Configuring these budgets is what makes a rolling node upgrade safe rather than disruptive.
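
To make the budget concrete, here is a minimal Python sketch that builds a PodDisruptionBudget manifest as a plain dictionary and works out how many pods could be evicted at a given moment. The helper names, the "web-pdb" name and the "app" label are hypothetical, and only an integer minAvailable is handled; percentages are left out to keep the arithmetic simple.

```python
def pod_disruption_budget(name: str, app_label: str, min_available: int) -> dict:
    """Build a policy/v1 PodDisruptionBudget manifest as a plain dict."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            # Number of matching pods that must stay available during
            # voluntary disruptions such as a node drain.
            "minAvailable": min_available,
            "selector": {"matchLabels": {"app": app_label}},
        },
    }


def allowed_disruptions(healthy_pods: int, min_available: int) -> int:
    """Pods that may be evicted right now without breaching the budget."""
    return max(0, healthy_pods - min_available)
```

With three healthy replicas and a budget of two, allowed_disruptions(3, 2) returns 1, so a drain can move one pod at a time. With a single replica and a budget of one it returns 0, which means the drain waits rather than taking the service down. The dictionary can be serialised with json.dumps and applied like any other manifest.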

Test before you touch production

Version changes can alter behaviour, deprecate APIs your manifests rely on, and surprise you in ways the release notes hint at but do not spell out for your specific workloads. Upgrade a non-production cluster first, run your workloads against it, and check that nothing depends on something the new version removed. Reading the release notes for deprecations and breaking changes before the upgrade, not after an incident, is the cheapest insurance available.
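
Part of that check can be scripted. The sketch below, in Python, scans already parsed manifests for API versions that the target Kubernetes version no longer serves. The REMOVED_IN table is a small illustrative subset, not a complete list, and the function name is hypothetical; the upstream deprecated API migration guide remains the authority.

```python
# (apiVersion, kind) -> first Kubernetes (major, minor) that no longer serves it.
# Illustrative subset only; extend it from the upstream migration guide.
REMOVED_IN = {
    ("extensions/v1beta1", "Ingress"): (1, 22),
    ("networking.k8s.io/v1beta1", "Ingress"): (1, 22),
    ("batch/v1beta1", "CronJob"): (1, 25),
    ("policy/v1beta1", "PodDisruptionBudget"): (1, 25),
    ("autoscaling/v2beta2", "HorizontalPodAutoscaler"): (1, 26),
}


def removed_apis(manifests: list[dict], target: tuple[int, int]) -> list[str]:
    """List manifests whose apiVersion/kind is gone in the target version."""
    findings = []
    for manifest in manifests:
        key = (manifest.get("apiVersion"), manifest.get("kind"))
        removed = REMOVED_IN.get(key)
        # Tuples compare element by element, so (1, 25) >= (1, 22) is True.
        if removed is not None and target >= removed:
            name = manifest.get("metadata", {}).get("name", "<unnamed>")
            findings.append(
                f"{key[1]}/{name}: {key[0]} removed in {removed[0]}.{removed[1]}"
            )
    return findings
```

An empty result does not prove the upgrade is safe, since behaviour can change without an API being removed, but a non-empty one is a clear reason to fix manifests before touching any cluster.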

Keep the gap small

Kubernetes upgrades go one minor version at a time, so a cluster that has fallen several versions behind faces a sequence of upgrades, each with its own risk, often under time pressure because support has lapsed. The way to avoid this is to upgrade regularly so the gap never grows large. A cluster that is upgraded on a steady cadence is always a single, well-understood step away from current, which is a far safer place to be than scrambling to catch up.
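
The cost of a large gap is easy to see once the steps are listed. This short Python sketch, with a hypothetical function name, enumerates the minor versions a cluster has to pass through on the way to a target. It does not check whether OKE still offers each intermediate version, which is a separate question.

```python
def upgrade_path(current: str, target: str) -> list[str]:
    """Return each minor version to step through, oldest first."""
    cur_major, cur_minor = (int(p) for p in current.lstrip("v").split(".")[:2])
    tgt_major, tgt_minor = (int(p) for p in target.lstrip("v").split(".")[:2])
    if cur_major != tgt_major:
        raise ValueError("major versions differ")
    if tgt_minor < cur_minor:
        raise ValueError("target is older than current")
    # One entry per minor version, because minors cannot be skipped.
    return [f"{cur_major}.{minor}" for minor in range(cur_minor + 1, tgt_minor + 1)]
```

A cluster on 1.28 aiming for 1.31 gets three entries, and each one is a full control plane and node upgrade with its own testing. A cluster kept current gets one.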

Make it routine and automated where you can

Upgrades that depend on someone remembering, on tribal knowledge and on manual steps will be skipped and then rushed. Treat upgrades as a scheduled, repeatable operation with a checklist, automated where the platform allows, and folded into the regular operating rhythm of the cluster. This is the same discipline that the rest of platform operations needs, and it is a core part of what a managed service provides. Our managed services practice runs OKE upgrades on a managed monthly basis so they simply happen.
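
One small piece of that automation is a scheduled check that reports where a cluster stands. The Python sketch below assumes the list of currently supported versions is supplied from outside, for example copied from the OKE console or fetched by a script of your own; the function names and any version strings used with it are illustrative, not a statement of what OKE supports today.

```python
def minor_of(version: str) -> tuple[int, int]:
    """Extract (major, minor) from strings such as 'v1.31.1' or '1.31'."""
    major, minor = version.lstrip("v").split(".")[:2]
    return int(major), int(minor)


def upgrade_status(cluster_version: str, supported_versions: list[str]) -> dict:
    """Summarise where a cluster stands relative to the supported versions."""
    if not supported_versions:
        raise ValueError("supported_versions must not be empty")
    current = minor_of(cluster_version)
    supported = sorted({minor_of(v) for v in supported_versions})
    newest = supported[-1]
    return {
        # True while the cluster's minor version is still on the list.
        "supported": current in supported,
        # Assumes the same major version, which holds for Kubernetes 1.x.
        "minors_behind": max(0, newest[1] - current[1]),
        "upgrade_due": current < newest,
    }
```

Run on a schedule, a report like this turns "someone should look at upgrades" into a ticket that opens itself whenever upgrade_due is true.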

An OKE upgrade framework

Track support windows so you never run an unsupported version.
Read release notes for deprecations and breaking changes first.
Test on non-production with your real workloads.
Set pod disruption budgets so node upgrades keep services available.
Roll nodes gracefully, adding new before removing old.
Upgrade on a cadence so the gap stays small.

Bringing it together

An OKE upgrade strategy comes down to staying current on a steady cadence, understanding that control plane and nodes upgrade separately, testing before production, and using disruption budgets to keep workloads serving while nodes roll. Make it routine and it stops being frightening. Let it slide and it becomes the riskiest change you run. Continue with troubleshooting OKE clusters, OKE cluster architecture and OKE security best practices. The OKE solution practice keeps clusters current and supported on a managed monthly retainer.

Part of a series
This guide is part of Kubernetes & DevOps on OCI — our complete pillar guide on the topic.

About the author

Fredrik Filipsson, Co-founder of OCI Specialists. 20 years of enterprise IT experience in Oracle Database, OCI cost optimization, licensing, and data platforms.
