Being able to rapidly troubleshoot and resolve performance degradations is crucial for communications service providers (CSPs). If they can’t, they put at risk not only their brand reputation but also their ability to generate revenue from enterprise services backed by service level agreements (SLAs).
However, not all CSPs have the tools that they need to troubleshoot their networks effectively, particularly when confronted with intermittent performance degradations—the most difficult type of issue to diagnose and resolve. CSPs without these tools are often faced with a lack of relevant contextual data necessary to bring swift resolution.
What is required for effective troubleshooting? It boils down to quality over quantity.
When troubleshooting mobile service issues, CSPs are most effective when they can quickly find the root cause, using the fewest resources (data, tools, personnel) possible. The ability to take raw data from one or more production systems and network elements and rapidly develop insight from it is what enables action.
But typically, CSPs are limited by the type (wrong or incomplete) and volume (too much, mostly irrelevant) of data available: call data records, events and traces. At best, these data sources provide an incomplete picture of network activity and performance while creating an overload of ‘big data’ that lacks context.
So what’s required to create that context? Full end-to-end network visibility.
Full core-to-RAN insight
Full core-to-RAN insight means seeing what’s happening everywhere along the service path: from the center of the network (core, EPC), along the metro and access segments, then out to the base transceiver station (eNodeB, gNodeB). To achieve this, CSPs need the ability to consult and interact with any individual network element along the service path to extract performance or fault data.
But just having access to such data doesn’t equate to insight into what’s happening and why. An end-to-end view is important, but on its own it lacks an element that effective troubleshooting depends on.
To understand what it really means to shed light on or help solve a problem—to see the underlying truth, as it were—consider the notion of ‘the truth in situ (on site)’ compared with perceived truth from a distance. Truth on the ground, at the site, is the gold standard.
Unfortunately, most CSPs do not have the tools to gain that in situ truth; they rely on remote performance analysis systems. Often these systems provide little more than a time-stamped alert to indicate what happened at a given point in time: a picture of the network in a fault state. But what you really want is the history (the movie), including the seconds leading up to the incident. You want to know what was happening immediately before the performance degradation occurred. You want context.
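To make the "movie, not picture" idea concrete, here is a minimal sketch of pulling the events recorded in the seconds before an alert. The record layout, field names, element names and timestamps are all hypothetical, chosen only for illustration; real event feeds will look different.

```python
from datetime import datetime, timedelta


def events_before(events, alert_time, window_seconds=60):
    """Return the events recorded in the window leading up to alert_time, oldest first."""
    start = alert_time - timedelta(seconds=window_seconds)
    selected = [e for e in events if start <= e["time"] <= alert_time]
    return sorted(selected, key=lambda e: e["time"])


# Hypothetical sample records (assumed field names: time, element, event).
alert_time = datetime(2024, 1, 1, 12, 0, 0)
events = [
    {"time": datetime(2024, 1, 1, 11, 58, 30), "element": "eNodeB-17", "event": "handover_failure"},
    {"time": datetime(2024, 1, 1, 11, 59, 20), "element": "eNodeB-17", "event": "rrc_setup_failure"},
    {"time": datetime(2024, 1, 1, 11, 59, 50), "element": "eNodeB-17", "event": "high_prb_utilization"},
    {"time": datetime(2024, 1, 1, 12, 0, 5), "element": "eNodeB-17", "event": "cell_recovered"},
]

# Keep only what happened in the 60 seconds before the alert fired.
context = events_before(events, alert_time, window_seconds=60)
for e in context:
    print(e["time"].isoformat(), e["element"], e["event"])
```

The window length is an assumption; in practice it would be tuned to how quickly the degradation in question tends to develop.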
Performance data with context
When diagnosing a performance degradation, you should be able to answer:
- When did the network fault or performance degradation occur?
- What happened in the seconds or minutes preceding it?
- How many subscribers were affected?
- Which subscribers or groups of subscribers were affected?
- Which RAN and core network elements were serving them?
- What devices were they using?
- What were they doing at the time?
- Which service or application were they accessing?
- Were they in an indoor location or moving at the time?
To the layman, these seem like perfectly reasonable questions that anyone trying to solve a problem would like to have answered. But most service assurance tools used by CSPs don’t offer this type of relevant data.
So where can CSPs find the missing contextual data? Beyond traces, events and call data records sourced from network elements, other sources to aid diagnostics include end-to-end message flows and xDRs, data from customer relationship management (CRM) systems and data sourced from end devices.
Analyzing this additional data helps pinpoint affected subscriber(s) or group(s), time of day, location (determined via GPS or event trilateration), device used, network type (2G/3G/4G/5G/NB-IoT), and service/application (e.g., VoLTE, web browsing).
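As a rough illustration of what slicing by several dimensions can look like, the sketch below groups degraded sessions by any combination of attributes and counts the distinct subscribers in each group. The session records, field names and values are hypothetical, and this is not a description of how any particular product does it.

```python
from collections import defaultdict

# Hypothetical degraded-session records; all field names and values are assumptions.
sessions = [
    {"subscriber": "sub-001", "device": "Phone A", "network": "4G", "service": "VoLTE", "cell": "cell-12"},
    {"subscriber": "sub-002", "device": "Phone A", "network": "4G", "service": "VoLTE", "cell": "cell-12"},
    {"subscriber": "sub-003", "device": "Phone B", "network": "5G", "service": "web browsing", "cell": "cell-40"},
    {"subscriber": "sub-001", "device": "Phone A", "network": "4G", "service": "VoLTE", "cell": "cell-12"},
]


def affected_by(sessions, *dimensions):
    """Map each combination of dimension values to the set of distinct affected subscribers."""
    groups = defaultdict(set)
    for s in sessions:
        key = tuple(s[d] for d in dimensions)
        groups[key].add(s["subscriber"])
    return dict(groups)


# Slice by cell and service, then list the largest groups first.
by_cell_service = affected_by(sessions, "cell", "service")
for key, subs in sorted(by_cell_service.items(), key=lambda kv: -len(kv[1])):
    print(key, len(subs), sorted(subs))
```

Counting distinct subscribers rather than sessions keeps one heavy user from looking like a widespread problem.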
EXFO refers to this rich contextual data as multi-dimensional, geo-located, subscriber-centric analytics.
With that data in hand, it’s possible to begin to effectively troubleshoot performance degradations across the network, and plan for the future with confidence.
Benefits of contextual insight
When operations teams have access to rich, contextual data they can:
- Prioritize resolution efforts based on the impact of performance degradations, using insight and context about affected subscribers or groups as a guide.
- Resolve degradations faster by having clear insight into the root cause, how to fix it and how to prevent similar issues in the future.
- Work better as a team by having a clear idea of what to fix and who needs to fix it, ending the finger-pointing.
- Reduce recurrence of the same issues by using multi-dimensional analytics to flag future issues faster and by using virtual drive testing (much more efficient than real-world drive testing).
- Plan more effectively by understanding the impact of deploying new radio technologies or devices, adjusting power levels, or making changes to beam sectors and hand-offs.
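The first benefit in the list, prioritizing by impact, can be sketched as a simple ranking. The example below assumes each open degradation already carries a count of affected subscribers and of affected SLA-backed subscribers; the record shape, the identifiers and the choice to rank SLA impact first are illustrative assumptions, not a prescribed policy.

```python
def prioritize(degradations):
    """Order degradations by affected SLA-backed subscribers, then by total affected subscribers."""
    return sorted(
        degradations,
        key=lambda d: (d["sla_subscribers"], d["subscribers"]),
        reverse=True,
    )


# Hypothetical open issues with assumed impact counts.
open_issues = [
    {"id": "deg-1", "subscribers": 1200, "sla_subscribers": 0},
    {"id": "deg-2", "subscribers": 300, "sla_subscribers": 45},
    {"id": "deg-3", "subscribers": 4000, "sla_subscribers": 0},
]

ranked = prioritize(open_issues)
print([d["id"] for d in ranked])
```

A different operator might weight the two counts rather than rank strictly by SLA impact; the point is only that the ordering comes from subscriber context rather than from the order in which alerts arrived.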
Rich, contextual data is useful for both troubleshooting and planning—not only in theory, but in the real world.
Contextual insight in the real world
In practice, the benefits of rich, contextual data are clear. Last November, EXFO conducted a survey of CSPs that revealed insights about the realities operators face:
- 74% of CSPs require 3 or more tools to analyze complex RAN issues
- 59% of RAN specialists lack subscriber and handset analytics
- 48% of operators lack the end-to-end visibility to optimize VoLTE
Not all CSPs currently have access to the contextual information needed to effectively troubleshoot the performance of their networks or the services that run on them. But they could.
Effective troubleshooting from the core to the RAN is imperative for service providers to deliver on their commitments. While end-to-end coverage is key, it is not the end of the discussion. As we have seen, it’s about getting the right data, at the right time, in context. Context leads to insight. Insight leads to action.
Learn more about Nova RAN, the network planning, troubleshooting and optimization solution that leverages geo-located subscriber data.