OKE Upgrade Strategy

Published Sep 26, 2025 · 9 min read · OCI Specialists

Kubernetes moves fast, and OKE tracks it. New versions arrive regularly, older ones fall out of support, and a cluster that is left alone will eventually run a version that no longer receives fixes. Upgrading is therefore not optional, but it is also the kind of change that can break workloads if done carelessly. A clear upgrade strategy turns version maintenance from a recurring scare into a routine, low-drama operation. This article explains how to upgrade OKE safely and how to keep doing it.

It is part of our OKE and containers series and pairs with troubleshooting OKE clusters.

Why you cannot skip upgrades

Each Kubernetes version is supported for a limited time, after which it stops receiving security and bug fixes. Running an unsupported version means living with known vulnerabilities and having no path to help when something breaks. Beyond support, newer versions bring features, performance improvements and fixes you will eventually want. The cost of staying current is the occasional upgrade; the cost of falling behind is a large, risky catch-up later, often forced by a security finding. Steady beats heroic.


The two parts of an OKE upgrade

An OKE upgrade has two halves that happen separately. First the control plane is upgraded to the new version, which OKE carries out for you once the upgrade is started. Then the worker nodes are upgraded to match, which you control. Kubernetes supports a limited version gap between the control plane and the nodes, provided the nodes are never newer than the control plane, so the two do not have to move in the same instant, but the nodes should not lag far behind. Knowing these are distinct steps is the foundation of a safe upgrade.

Step | Who drives it | Main risk
Control plane upgrade | OKE managed | API behaviour changes
Worker node upgrade | You | Workload disruption during node replacement
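
Because the two halves move separately, it can help to check the gap between them before and after each step. The following is a minimal sketch, not an OKE tool: `parse_minor` and `check_skew` are hypothetical helper names, and the default `max_minor_gap` of 2 is an assumption that should be replaced with the limit documented for the versions you actually run.

```python
def parse_minor(version: str) -> tuple[int, int]:
    """Return (major, minor) from a string such as 'v1.31.1' or '1.31'."""
    parts = version.lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def check_skew(control_plane: str, node: str, max_minor_gap: int = 2) -> str:
    """Describe whether a node version is an acceptable match for the control plane.

    max_minor_gap is an assumed limit; check the skew policy for your versions.
    """
    cp_major, cp_minor = parse_minor(control_plane)
    node_major, node_minor = parse_minor(node)
    if cp_major != node_major:
        return "unsupported: major versions differ"
    gap = cp_minor - node_minor
    if gap < 0:
        # Nodes must never run ahead of the control plane.
        return "unsupported: node is newer than the control plane"
    if gap > max_minor_gap:
        return f"unsupported: node is {gap} minor versions behind"
    if gap == 0:
        return "ok: versions match"
    return f"ok: node is {gap} minor version(s) behind; upgrade soon"
```

Calling `check_skew("v1.31.1", "v1.30.1")` reports an acceptable but lagging node pool, which is the normal state between the control plane upgrade and the node upgrade.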

Upgrade nodes without dropping workloads

The worker node upgrade is where workloads are at risk, because nodes are replaced or drained to bring them onto the new version. Done well, you add new nodes on the new version, move pods onto them gracefully, and remove the old nodes, so the workload keeps serving throughout. Pod disruption budgets tell Kubernetes how many replicas must stay available during voluntary disruptions such as a node drain, which prevents an upgrade from taking a service down. Configuring these budgets is what makes a rolling node upgrade safe rather than disruptive.
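
As an illustration of what such a budget looks like, the sketch below builds a `policy/v1` PodDisruptionBudget manifest as a Python dictionary and estimates how many pods a drain could evict at once. The helper names, the `web-pdb` name and the `app: web` label are all hypothetical, and `evictions_allowed` is a simplification of the arithmetic, not the eviction logic Kubernetes itself runs.

```python
import json


def build_pdb(name: str, app_label: str, min_available: int) -> dict:
    """Build a PodDisruptionBudget manifest for pods labelled app=<app_label>."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            "minAvailable": min_available,
            "selector": {"matchLabels": {"app": app_label}},
        },
    }


def evictions_allowed(healthy_pods: int, min_available: int) -> int:
    """Simplified view: pods that can be evicted without breaching the budget."""
    return max(0, healthy_pods - min_available)


if __name__ == "__main__":
    # Hypothetical workload: three replicas, at least two must stay up.
    pdb = build_pdb("web-pdb", "web", min_available=2)
    print(json.dumps(pdb, indent=2))  # JSON manifests can be applied with kubectl
    print(evictions_allowed(healthy_pods=3, min_available=2))
```

One design point worth noticing: if a workload runs two replicas and the budget also demands two, no eviction is ever allowed and a node drain will wait indefinitely. The budget has to leave room for at least one pod to move.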

Test before you touch production

Version changes can alter behaviour, remove deprecated APIs your manifests rely on, and surprise you in ways the release notes hint at but do not spell out for your specific workloads. Upgrade a non-production cluster first, run your workloads against it, and check that nothing depends on something the new version removed. Reading the release notes for deprecations and breaking changes before the upgrade, not after an incident, is the cheapest insurance available.
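
Part of that check can be scripted. The sketch below scans already-parsed manifests for API versions that were removed at or before a target Kubernetes version. `REMOVED_APIS` is a small illustrative subset, not a complete list, and `find_removed` is a hypothetical helper; the release notes and the upstream deprecation guide remain the authority.

```python
# (apiVersion, kind) -> (major, minor) of the Kubernetes release that removed it.
# Illustrative subset only; consult the release notes for the full picture.
REMOVED_APIS = {
    ("extensions/v1beta1", "Ingress"): (1, 22),
    ("networking.k8s.io/v1beta1", "Ingress"): (1, 22),
    ("batch/v1beta1", "CronJob"): (1, 25),
    ("policy/v1beta1", "PodDisruptionBudget"): (1, 25),
    ("autoscaling/v2beta2", "HorizontalPodAutoscaler"): (1, 26),
}


def find_removed(manifests: list[dict], target: tuple[int, int]) -> list[str]:
    """Return one finding per manifest whose API is gone in the target version."""
    findings = []
    for manifest in manifests:
        key = (manifest.get("apiVersion"), manifest.get("kind"))
        removed_in = REMOVED_APIS.get(key)
        if removed_in is not None and target >= removed_in:
            name = manifest.get("metadata", {}).get("name", "<unnamed>")
            findings.append(
                f"{key[1]}/{name}: {key[0]} removed in {removed_in[0]}.{removed_in[1]}"
            )
    return findings


if __name__ == "__main__":
    # Hypothetical manifests, as they might look after loading from files.
    manifests = [
        {"apiVersion": "batch/v1beta1", "kind": "CronJob", "metadata": {"name": "nightly"}},
        {"apiVersion": "apps/v1", "kind": "Deployment", "metadata": {"name": "web"}},
    ]
    for finding in find_removed(manifests, target=(1, 25)):
        print(finding)
```

Running a scan like this against the manifests in source control, before the non-production upgrade, turns a release-note warning into a concrete list of files to fix.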

Keep the gap small

Kubernetes upgrades go one minor version at a time, so a cluster that has fallen several versions behind faces a sequence of upgrades, each with its own risk, often under time pressure because support has lapsed. The way to avoid this is to upgrade regularly so the gap never grows large. A cluster that is upgraded on a steady cadence is always a single, well-understood step away from current, which is a far safer place to be than scrambling to catch up.
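
The cost of a large gap is easy to make concrete. This minimal sketch lists the sequential minor-version steps between a current and a target version; `upgrade_path` is a hypothetical helper, and it assumes every intermediate version is still available to upgrade to, which may not hold on a real platform.

```python
def upgrade_path(current: str, target: str) -> list[str]:
    """List the minor versions to step through, one at a time, to reach target."""
    cur_major, cur_minor = (int(p) for p in current.lstrip("v").split(".")[:2])
    tgt_major, tgt_minor = (int(p) for p in target.lstrip("v").split(".")[:2])
    if cur_major != tgt_major:
        raise ValueError("major version changes are out of scope for this sketch")
    if tgt_minor < cur_minor:
        raise ValueError("downgrades are not supported")
    # Each entry is a separate upgrade with its own testing and its own risk.
    return [f"{cur_major}.{minor}" for minor in range(cur_minor + 1, tgt_minor + 1)]


if __name__ == "__main__":
    print(upgrade_path("1.28", "1.31"))
    print(upgrade_path("1.30", "1.31"))
```

A cluster three versions behind has three full upgrade cycles ahead of it, each needing its own release-note review and non-production test, while a cluster kept on cadence has one.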

Make it routine and automated where you can

Upgrades that depend on someone remembering, on tribal knowledge and on manual steps will be skipped and then rushed. Treat upgrades as a scheduled, repeatable operation with a checklist, automated where the platform allows, and folded into the regular operating rhythm of the cluster. This is the same discipline that the rest of platform operations needs, and it is a core part of what a managed service provides. Our managed services practice runs OKE upgrades on a managed monthly basis so they simply happen.

An OKE upgrade framework

  1. Track support windows so you never run an unsupported version.
  2. Read release notes for deprecations and breaking changes first.
  3. Test on non-production with your real workloads.
  4. Set pod disruption budgets so node upgrades keep services available.
  5. Roll nodes gracefully, adding new before removing old.
  6. Upgrade on a cadence so the gap stays small.
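
Step 5 of the framework, adding new nodes before removing old ones, can be written down as an ordered plan. The sketch below only generates the sequence of actions as text; it does not call OCI or Kubernetes. `rolling_plan` and its `surge` parameter are hypothetical names chosen for illustration.

```python
def rolling_plan(old_nodes: list[str], surge: int = 1) -> list[str]:
    """Return ordered steps that replace old nodes in batches of `surge`.

    New capacity is always added before any old node in the batch is removed.
    """
    if surge < 1:
        raise ValueError("surge must be at least 1")
    steps = []
    for start in range(0, len(old_nodes), surge):
        batch = old_nodes[start:start + surge]
        steps.append(f"add {len(batch)} node(s) on the new version and wait until Ready")
        for node in batch:
            steps.append(f"cordon {node}")  # stop new pods landing on the old node
            steps.append(f"drain {node} (evictions respect pod disruption budgets)")
            steps.append(f"remove {node}")
    return steps


if __name__ == "__main__":
    for step in rolling_plan(["node-a", "node-b"]):
        print(step)
```

A larger `surge` finishes sooner but needs more spare capacity while the upgrade is in flight, so the right value depends on quota and budget as much as on the workload.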

Bringing it together

An OKE upgrade strategy comes down to staying current on a steady cadence, understanding that control plane and nodes upgrade separately, testing before production, and using disruption budgets to keep workloads serving while nodes roll. Make it routine and it stops being frightening. Let it slide and it becomes the riskiest change you run. Continue with troubleshooting OKE clusters, OKE cluster architecture and OKE security best practices. The OKE solution practice keeps clusters current and supported on a managed monthly retainer.
