OCI DNS and Traffic Management: Zones, Steering Policies, and Health Checks
DNS is the quiet layer that decides where every request actually goes. On OCI it is more than a name resolver: through Traffic Management steering policies, OCI DNS can route users to the nearest region, steer traffic away from an unhealthy endpoint, and balance load across multiple sites. Used well, it is one of the most powerful tools for performance and resilience. Used poorly, it can prolong an outage: a forgotten time to live (TTL) value can keep sending users to a dead region long after everything else has recovered. This guide explains how OCI DNS and Traffic Management work and how to use them.
DNS sits alongside the rest of your network, so see the pillar guide to OCI networking for context. Traffic steering pairs naturally with load balancing, covered in the guide to OCI load balancing options, and with the patterns in the guide to multi-region failover.
Public and private DNS zones
OCI DNS hosts two kinds of zone. A public zone resolves names for the internet, so external users can find your services. A private zone resolves names only inside your VCNs, so internal services can find each other by name without exposing anything publicly. Most estates use both: a public zone for customer-facing endpoints and private zones for internal service discovery. Keeping the two cleanly separated avoids the classic mistake of leaking internal names to the public internet.
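One way to guard against leaked internal names is a simple check over the records you plan to publish. The sketch below assumes the records are available as plain (name, type, value) tuples and that internal names live under one known suffix. The `find_leaks` helper, the `internal.example.com` suffix, and the sample records are hypothetical and are not part of any OCI API.

```python
import ipaddress
from typing import Iterable, List, Tuple

Record = Tuple[str, str, str]  # (name, record type, value)

# RFC 1918 private IPv4 ranges; extend this for your own address plan.
PRIVATE_NETS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def find_leaks(public_records: Iterable[Record], internal_suffix: str) -> List[Record]:
    """Return public records that expose an internal name or a private IPv4 address."""
    suffix = internal_suffix.lower().rstrip(".")
    leaks: List[Record] = []
    for name, rtype, value in public_records:
        host = name.lower().rstrip(".")
        internal_name = host == suffix or host.endswith("." + suffix)
        private_ip = False
        if rtype.upper() == "A":
            addr = ipaddress.ip_address(value)
            private_ip = any(addr in net for net in PRIVATE_NETS)
        if internal_name or private_ip:
            leaks.append((name, rtype, value))
    return leaks


if __name__ == "__main__":
    # Hypothetical public zone contents (documentation addresses).
    records = [
        ("www.example.com", "A", "192.0.2.10"),
        ("db.internal.example.com", "A", "192.0.2.20"),
        ("api.example.com", "A", "10.0.1.5"),
    ]
    for leak in find_leaks(records, "internal.example.com"):
        print("possible leak:", leak)
```

A check like this could run in a pipeline before a zone change is applied, so a stray internal record is caught before it becomes publicly resolvable.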
Traffic Management steering policies
The real power of OCI DNS is in Traffic Management steering policies. A steering policy decides which answer a DNS query receives based on rules you define, rather than always returning the same record. This lets DNS itself become a routing decision point. There are several policy types, and choosing the right one is the central decision.
| Steering policy | How it decides | Typical use |
|---|---|---|
| Failover | Primary answer unless its health check fails, then the backup | Active and standby across regions |
| Load balancer | Distributes answers by weight | Spreading traffic across multiple endpoints |
| Geolocation | Answer based on the geographic location the query comes from | Routing users to the nearest region or meeting residency rules |
| ASN | Answer based on the autonomous system (network) the query comes from | Carrier or network-specific routing |
| IP prefix | Answer based on the IP address range the query comes from | Fine-grained routing by address block |
For a simple two-region active and standby design, the failover policy is usually what you want. For routing users to whichever region is closest, geolocation is the natural fit. Many designs combine policies, for example geolocation to pick a region with failover underneath, so a regional outage still redirects users elsewhere.
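The combination of geolocation with failover underneath can be pictured as a small decision function. What follows is an illustrative model in plain Python, not the OCI API or its rule syntax: the `Answer` class, the region labels, the geography keys, and the choice to return `None` when nothing is healthy are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Answer:
    region: str    # hypothetical label for the endpoint's region
    address: str   # address the DNS answer would carry
    healthy: bool  # latest health check result for this endpoint


def failover(ordered: List[Answer]) -> Optional[Answer]:
    """Return the first healthy answer in priority order, or None if all are down."""
    for answer in ordered:
        if answer.healthy:
            return answer
    return None


def geo_with_failover(
    client_geo: str,
    preferences: Dict[str, List[str]],
    answers: Dict[str, Answer],
    default_order: List[str],
) -> Optional[Answer]:
    """Pick the region order for the client's geography, then apply failover to it."""
    order = preferences.get(client_geo, default_order)
    return failover([answers[region] for region in order])


if __name__ == "__main__":
    answers = {
        "frankfurt": Answer("frankfurt", "192.0.2.10", healthy=False),
        "ashburn": Answer("ashburn", "198.51.100.10", healthy=True),
    }
    preferences = {
        "EU": ["frankfurt", "ashburn"],  # nearest region first, then the fallback
        "NA": ["ashburn", "frankfurt"],
    }
    print(geo_with_failover("EU", preferences, answers, ["ashburn", "frankfurt"]))
```

In this model a European client normally gets the Frankfurt answer, and gets Ashburn only while Frankfurt is marked unhealthy, which is the behavior the combined design is after.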
Health checks, the engine behind failover
A failover policy is only as good as the health check that drives it. OCI health checks probe an endpoint from multiple vantage points and report whether it is healthy. When the primary fails its health check, the steering policy stops returning it and serves the backup instead. The detail that matters is what the health check actually probes. A check that only confirms a port is open can report healthy while the application behind it is broken. Probe a real application endpoint that exercises the dependencies that matter, so the health signal reflects whether users can actually be served.
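On the application side, a health endpoint that exercises real dependencies might be structured along these lines. This is a minimal sketch under stated assumptions: the `deep_health` function, the dependency names, and the convention of answering 200 when everything passes and 503 otherwise are illustrative choices, and the lambdas stand in for real checks such as a trivial database query.

```python
from typing import Callable, Dict, Tuple


def deep_health(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, Dict[str, str]]:
    """Run every dependency check and return (HTTP status, per-dependency detail).

    The status is 200 only if all checks pass; any failure or exception yields 503,
    so a probe keyed on the status code sees the endpoint as unhealthy.
    """
    results: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            ok = False  # a check that raises counts as a failed dependency
        results[name] = "ok" if ok else "failed"
    status = 200 if all(state == "ok" for state in results.values()) else 503
    return status, results


if __name__ == "__main__":
    # Hypothetical dependency checks; replace with real probes of your own services.
    status, detail = deep_health({
        "database": lambda: True,
        "cache": lambda: False,
    })
    print(status, detail)
```

Keep such checks cheap and bounded by timeouts, since the endpoint is probed repeatedly and a slow check can itself cause a false failure.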
Designing for failover that actually works
A resilient design puts these pieces together in a clear pattern. Host your record in a Traffic Management failover policy. Point the primary at your production region and the backup at your standby region. Attach a health check that probes a real application endpoint. Set a short time to live (TTL) so resolvers pick up the change quickly. Then test it by deliberately failing the primary health check and confirming traffic moves. This last step is the one teams skip, and it is the one that proves the design works. The wider failover picture, including the data layer, is in the guide to multi-region failover patterns.
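The final test step can be partly automated with a small polling script that waits until the name resolves to the standby. The sketch below is one possible shape for such a drill helper, using only the Python standard library. The `wait_for_failover` and `system_resolve` names, the hostname, and the addresses are hypothetical, and the local resolver may cache answers, so treat the measured time as approximate.

```python
import socket
import time
from typing import Callable, Set


def system_resolve(hostname: str) -> Set[str]:
    """Resolve a hostname through the operating system resolver."""
    infos = socket.getaddrinfo(hostname, None, type=socket.SOCK_STREAM)
    return {info[4][0] for info in infos}


def wait_for_failover(
    resolve: Callable[[str], Set[str]],
    hostname: str,
    expected: Set[str],
    timeout_s: float = 300.0,
    poll_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> float:
    """Poll until the hostname resolves only to expected addresses.

    Returns the elapsed seconds on success and raises TimeoutError otherwise.
    """
    start = clock()
    while True:
        answers = resolve(hostname)
        elapsed = clock() - start
        if answers and answers <= expected:
            return elapsed
        if elapsed >= timeout_s:
            raise TimeoutError(f"{hostname} still resolves to {sorted(answers)}")
        sleep(poll_s)


if __name__ == "__main__":
    # Run this after forcing the primary health check to fail.
    # Hostname and standby address are placeholders for your own values.
    took = wait_for_failover(system_resolve, "app.example.com", {"198.51.100.10"})
    print(f"standby answer observed after {took:.0f}s")
```

The resolver, sleep, and clock are passed in as parameters so the same function can be exercised against a fake resolver before it is trusted in a real drill.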
A DNS and traffic steering checklist
- Separate public zones for external names from private zones for internal service discovery.
- Choose the steering policy that matches the goal: failover for resilience, geolocation for proximity, load balancer for distribution.
- Attach health checks that probe real application endpoints, not just open ports.
- Set short time to live values on any record that participates in failover.
- Combine policies where needed, such as geolocation with failover underneath.
- Test failover by forcing a health check to fail, and confirm traffic actually moves.
Common mistakes
- Long time to live values that keep clients pinned to a dead endpoint after a failover.
- Health checks that probe only a port, so a broken application still looks healthy.
- Leaking internal names through public zones instead of using private zones.
- Configuring a failover policy and never testing that it actually fails over.
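The first mistake above can be put in rough numbers. As a simplified model, assume a client can keep using the old answer for up to the time the health check needs to detect the failure plus the record's time to live (TTL). The `stale_window_seconds` function and the sample values are illustrative assumptions, and resolvers that hold answers longer than the TTL would make the real window larger.

```python
def stale_window_seconds(detection_s: float, ttl_s: float) -> float:
    """Rough upper bound on how long clients may keep reaching a dead endpoint.

    detection_s: assumed time for the health check to mark the endpoint unhealthy.
    ttl_s: TTL of the record, i.e. how long a resolver may cache the old answer.
    """
    return detection_s + ttl_s


if __name__ == "__main__":
    detection_s = 60  # assumed detection time; depends on your probe settings
    for ttl_s in (30, 300, 3600):
        window = stale_window_seconds(detection_s, ttl_s)
        print(f"TTL {ttl_s:>4}s -> up to about {window:.0f}s on the old answer")
```

Under these assumptions, the TTL dominates the window once it is much longer than the detection time, which is why a one-hour TTL on a failover record undoes most of the benefit of a fast health check.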
If you want DNS and traffic steering designed and tested as part of a resilient OCI architecture, our team builds these into networking and disaster recovery engagements. See the OCI networking solution and the pillar guide to OCI networking for the full context.
Steer your OCI traffic with confidence
Moving Oracle workloads to OCI, or already running on OCI and not sure the architecture or the spend is right? Most teams bring in a specialist before they commit to a region, a shape, or a Universal Credits number. OCISpecialists.com plans the landing zone, runs the migration, and manages the estate after go live, on a fixed project fee, a managed monthly retainer, or a cost optimization fee paid only on verified savings. For the Oracle licensing and BYOL side of any OCI move, Redress Compliance is the leading independent Oracle licensing and negotiation firm, with 500+ engagements across Oracle's full product line.