The difference between a network problem that takes ten minutes and one that takes ten hours is method. When something cannot reach something else, the instinct is to start changing rules and rebooting things. The faster path is to isolate where in the flow the break happens, then check the small number of things that can cause a break at that point. This guide gives you that method and the failure patterns it most often uncovers on OCI.
Isolate the layer first
Before touching any configuration, work out which layer is actually failing. A connection that never establishes is a different problem from one that establishes but is slow, and a name that does not resolve is different again. Spending two minutes to classify the symptom saves you from chasing the wrong cause.
| Symptom | Likely layer | First thing to check |
|---|---|---|
| Name does not resolve | DNS | Private DNS zones and resolver configuration |
| Connection times out | Routing or security rules | Route tables and security rules along the path |
| Connection refused | Destination service | Whether the service is listening on the expected port |
| Connects but slow | Performance | Shape bandwidth, placement, load balancer sizing |
| Works one way only | Asymmetric routing | Return path route tables and stateful firewalls |
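
The first three rows of the table map onto distinct socket errors, so a single connection attempt can do much of the classification for you. The following is a minimal Python sketch, not an OCI tool. The function names `classify_error` and `probe` and the three second timeout are assumptions chosen for illustration.

```python
import socket


def classify_error(exc):
    """Map a socket exception to the likely failing layer."""
    if isinstance(exc, socket.gaierror):
        return "DNS: name does not resolve"
    if isinstance(exc, ConnectionRefusedError):
        return "Destination service: connection refused"
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "Routing or security rules: connection timed out"
    return f"Unclassified: {exc!r}"


def probe(host, port, timeout=3.0):
    """Attempt one TCP connection and report which layer the result points at."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "Reachable: TCP connection established"
    except OSError as exc:
        return classify_error(exc)
```

Run `probe` from the source host against the destination name and port. A result of "Reachable" on a flow that users still describe as broken moves you to the performance or asymmetric routing rows rather than the first three.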
A troubleshooting framework
- Classify the symptom. Is it resolution, reachability, refusal or slowness? This decides where you look.
- Confirm DNS. If a name is involved, verify it resolves to the address you expect before anything else.
- Trace the route. Check the route table on the source subnet, every gateway in the path, and the destination subnet. Confirm a route exists in both directions.
- Check security rules in both directions. Verify security lists and network security groups allow the flow outbound from the source and inbound to the destination.
- Run the Network Path Analyzer. Let OCI evaluate the full virtual path and tell you exactly where it breaks, as covered in the Path Analyzer guide.
- Confirm the service. Make sure the destination is actually listening on the expected port and that the operating system firewall is not blocking it.
- Read the flow logs. Use VCN flow logs to see whether traffic is being accepted or rejected and where.
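
The DNS step in the list above can be sketched in a few lines of Python: resolve the name from the source host, then check whether each answer is a private or a public address. The helper names `resolve_all` and `split_private_public` are assumptions for illustration. Note that `is_private` in the standard library covers more than the RFC 1918 ranges (loopback and link-local count as private too), so a VCN using other ranges would need an explicit CIDR comparison instead.

```python
import ipaddress
import socket


def resolve_all(name):
    """Return the sorted set of addresses the local resolver gives for a name."""
    infos = socket.getaddrinfo(name, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def split_private_public(addresses):
    """Separate addresses into (private, public) lists."""
    private, public = [], []
    for addr in addresses:
        target = private if ipaddress.ip_address(addr).is_private else public
        target.append(addr)
    return private, public
```

If a name that should stay inside the VCN shows up in the public list, the resolver is answering from the wrong zone and the traffic is heading down a different path from the one you are about to trace.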
Routing problems
Most reachability failures come down to routing: a subnet route table missing an entry for the destination, a gateway route table that does not forward the range, or a transit design where a spoke never learns the path it needs. The discipline is to check the route in both directions, because a flow that has a route out but no route back will time out exactly as if it had no route at all. In transit designs the most common culprit is a spoke that is attached but importing the wrong route table, so it cannot reach the destination even though the path exists. The transit routing guide covers how those route tables interact.
Asymmetric routing deserves special attention. When traffic leaves by one path and returns by another, a stateful firewall in the middle drops the return because it never saw the request. This presents as a connection that works in one direction or fails intermittently, and it is almost always a more specific route somewhere sending the return traffic down the wrong path.
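
To make the two way check concrete, here is a small Python sketch that models a route table as a list of `(cidr, target)` pairs and picks the most specific match, which is how a more specific route can quietly pull return traffic onto a different path. The data model, the gateway labels and the function names are assumptions for illustration, not OCI API objects, and a difference between the two next hops is only a hint to investigate, not proof of asymmetry.

```python
import ipaddress


def next_hop(route_table, destination):
    """Return the target of the most specific matching route, or None."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table
        if dest in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    # Longest prefix wins when several routes cover the destination.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def check_both_directions(src_table, dst_table, src_ip, dst_ip):
    """Report forward and return next hops and flag the two failure modes."""
    forward = next_hop(src_table, dst_ip)
    backward = next_hop(dst_table, src_ip)
    if forward is None or backward is None:
        verdict = "missing route"
    elif forward != backward:
        verdict = "possible asymmetric path"
    else:
        verdict = "symmetric"
    return forward, backward, verdict
```

For example, if the source table sends `10.1.0.0/16` to a target labelled `drg`, while the destination table sends `10.0.0.0/16` to `drg` but also has a more specific `10.0.5.0/24` pointing at `firewall`, then a flow from `10.0.5.20` to `10.1.2.3` leaves via `drg` and returns via `firewall`, and the sketch flags it.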
Security rule problems
The second large category is security rules. Because security list and network security group rules are stateful by default, you usually only need to allow the initiating direction, but you must allow it at both the source and the destination. A rule marked stateless also needs a matching rule for the return traffic. A flow blocked by a missing inbound rule on the destination presents as a timeout, indistinguishable at first glance from a routing problem, which is why the method checks routing and rules as separate steps. When you suspect rules, the flow logs are decisive, because they show whether a packet was accepted or rejected and by which rule set. The network security guide explains how the two control types interact.
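
The "initiating direction, at both ends" check can be sketched as follows. This is a deliberately simplified model: the `Rule` class, its fields and `diagnose_flow` are hypothetical, cover TCP only, and assume every rule is stateful, so the return traffic needs no rule of its own.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Simplified stateful TCP rule: a peer CIDR and a destination port range."""
    peer_cidr: str
    port_min: int
    port_max: int

    def matches(self, peer_ip, port):
        in_cidr = ipaddress.ip_address(peer_ip) in ipaddress.ip_network(self.peer_cidr)
        return in_cidr and self.port_min <= port <= self.port_max


def diagnose_flow(src_egress, dst_ingress, src_ip, dst_ip, port):
    """Check the initiating direction at both ends of the flow."""
    # Egress rules at the source are matched against the destination address.
    if not any(rule.matches(dst_ip, port) for rule in src_egress):
        return "blocked: no egress rule at source"
    # Ingress rules at the destination are matched against the source address.
    if not any(rule.matches(src_ip, port) for rule in dst_ingress):
        return "blocked: no ingress rule at destination"
    return "allowed"
```

Keeping the two checks separate mirrors the way the fault presents: either miss produces the same timeout at the client, and only checking each end in turn tells you which one it is.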
Let the Path Analyzer do the work
OCI includes a Network Path Analyzer that evaluates the virtual path between a source and a destination and reports exactly where a flow would be allowed or blocked, including route tables, security rules and gateways. For most reachability problems this is the fastest single tool, because it inspects the configured path without you needing to log into anything. Run it early once you have classified the symptom as a reachability problem, and let it point you at the specific route table or rule at fault. The Path Analyzer guide covers how to read its output.
Performance problems are different
When the connection works but is slow, you are no longer troubleshooting reachability, you are troubleshooting performance, and the method changes. Separate the network from the application by measuring raw throughput and latency between hosts, then compare it to what the application sees. If the raw numbers are healthy the network is not your problem. If they are not, look at shape bandwidth, instance placement and load balancer sizing, in that order. The performance tuning guide walks through each lever.
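
As a starting point for the raw measurement, the Python sketch below times repeated TCP handshakes between two hosts. It measures connection latency only, so throughput still needs a dedicated tool, and the function names, sample count and timeout are assumptions for illustration.

```python
import socket
import statistics
import time


def connect_latencies(host, port, samples=5, timeout=3.0):
    """Time several TCP handshakes to the host, returning seconds per attempt."""
    results = []
    for _ in range(samples):
        start = time.perf_counter()
        with socket.create_connection((host, port), timeout=timeout):
            results.append(time.perf_counter() - start)
    return results


def summarize(latencies):
    """Return the median and worst case in milliseconds."""
    return {
        "median_ms": statistics.median(latencies) * 1000,
        "max_ms": max(latencies) * 1000,
    }
```

Compare the median with the response times the application reports. A handshake that completes quickly while the application is slow points away from the network, and a large gap between the median and the worst case is worth noting before you look at shapes or placement.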
Reading flow logs effectively
VCN flow logs are the closest thing OCI gives you to watching traffic pass, and they settle many arguments quickly. When a flow is failing and you cannot tell whether routing or a security rule is at fault, the logs tell you directly: if the traffic appears and is rejected, a rule is blocking it, and if the traffic never appears at all, it is not even reaching the point where rules would evaluate it, which points back at routing or at the source. Learning to read the accept and reject records, and which rule set produced them, turns a guessing game into a lookup. Enable the logs before you need them, because a problem you cannot reproduce is far harder to diagnose without a record of what happened the first time.
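
The accept, reject or absent reasoning is simple enough to write down as code. The sketch below works on simplified records with the hypothetical keys `src`, `dst`, `dst_port` and `action`; these are assumptions for illustration, so map them onto the fields of the real flow log records before using anything like this.

```python
def diagnose_from_flow_logs(records, src_ip, dst_ip, dst_port):
    """Apply the accept / reject / absent reasoning to simplified flow records."""
    matching = [
        r for r in records
        if r["src"] == src_ip and r["dst"] == dst_ip and r["dst_port"] == dst_port
    ]
    if not matching:
        return "no records: traffic is not arriving, check routing or the source"
    if any(r["action"] == "REJECT" for r in matching):
        return "rejected: a security rule is blocking the flow"
    return "accepted: look at the service or the return path"
```

The third branch matters as much as the first two: traffic that is accepted but still fails means the network delivered the packet, so the next stop is the destination service or the path the reply takes.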
The logs are also where intermittent problems give themselves away. A flow that works most of the time but fails occasionally often reveals an asymmetric path or a load-balanced backend that is misconfigured, and the pattern only becomes visible when you can see many flows over time rather than a single failed attempt. For anything intermittent, the logs are usually the fastest route to the cause.
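
One way to make that pattern visible is to group many records by destination and compare reject rates. The Python sketch below assumes simplified records with the hypothetical keys `dst` and `action`, which are illustrative rather than the real flow log field names.

```python
from collections import Counter


def reject_rate_by_destination(records):
    """Return {destination: fraction of its flows that were rejected}."""
    totals, rejects = Counter(), Counter()
    for r in records:
        totals[r["dst"]] += 1
        if r["action"] == "REJECT":
            rejects[r["dst"]] += 1
    return {dst: rejects[dst] / totals[dst] for dst in totals}
```

A single backend with a reject rate far above its peers is a strong hint that one member of the pool has a different rule set or route from the rest.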
The most common OCI network faults
Experience shows the same handful of causes behind most OCI network problems:
- A missing route on the source subnet, so traffic has nowhere to go.
- A missing inbound rule on the destination, so traffic arrives and is dropped.
- Overlapping address ranges, so two networks cannot be distinguished.
- A private resource that cannot reach an Oracle service because no service gateway route exists.
- A name that resolves to a public address when it should resolve to a private one, sending traffic down the wrong path entirely.
- Asymmetric routing, where outbound and return traffic take different paths and a stateful device drops the return.
Knowing this short list means that when you classify a symptom, you already have a strong sense of where to look first.
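
Of these faults, overlapping address ranges are the easiest to check mechanically. The Python sketch below uses the standard library `ipaddress` module; the function name `find_overlaps` and the network names in the usage note are hypothetical.

```python
import ipaddress
from itertools import combinations


def find_overlaps(named_cidrs):
    """Return pairs of network names whose CIDR blocks overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in named_cidrs.items()}
    return [
        (a, b)
        for a, b in combinations(networks, 2)
        # Only compare like with like: IPv4 against IPv4, IPv6 against IPv6.
        if networks[a].version == networks[b].version
        and networks[a].overlaps(networks[b])
    ]
```

Given `{"vcn-a": "10.0.0.0/16", "vcn-b": "10.0.128.0/20", "on-prem": "192.168.0.0/16"}`, the function reports the pair `("vcn-a", "vcn-b")`, because the second block sits inside the first. Running a check like this before peering or attaching networks is cheaper than discovering the overlap through a routing fault.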
None of these are subtle once you know to check for them, which is the whole point of a method. The faults that waste hours are not exotic. They waste hours because the investigator changed three things at once, never confirmed the basics, or assumed the application was at fault when a route table was the cause. A short, ordered checklist beats intuition almost every time.
Documenting and learning from incidents
Every network problem you solve is worth a short record: what the symptom was, where the break turned out to be, and how you found it. Over time this record becomes the fastest troubleshooting tool you have, because most estates fail in a small number of characteristic ways, and a team that has written down its last few incidents recognises the next one immediately. It also surfaces patterns worth fixing permanently, such as a class of change that keeps causing the same outage, which points at a process or a guardrail rather than a one-off fix. Treating troubleshooting as something to learn from, rather than something to survive, is what turns a reactive network team into one that rarely gets surprised twice by the same thing.
Building the habit
The value of a method is that it works under pressure, when an outage is live and the temptation to guess is strongest. Classify the symptom, confirm DNS, trace the route both ways, check the rules both ways, run the Path Analyzer, confirm the service, and read the flow logs. Almost every OCI network problem falls out of that sequence quickly. For the wider context of how these components fit, see the complete OCI networking guide. When an estate needs the routing, security and observability set up so that problems are quick to find, our OCI networking solution covers the build and the run.