November 7, 2013

The Need for End-to-End Network Visibility

As telecommunication services evolve alongside subscribers’ applications, there is a growing disconnect between the quality operators believe they deliver and the quality their customers actually perceive. As Ericsson points out in a recent paper, “The quality of user experience has always been important for operators. However, with the rise of mobile-broadband and smartphone usage over the past few years, the meaning of user experience has changed dramatically”. The issue stems from the operator’s lack of visibility beyond its own network devices in the Radio Access Network (RAN) and the Core Network: it cannot measure the actual connectivity customers have at their end. Measuring cell power or aggregated throughput in the core is not the same as knowing that an important customer has weak coverage, or that their application is timing out because of high latency.

For Internet of Everything (IoE) or Machine-to-Machine (M2M) applications there is also a regulatory angle, in the form of the Service Level Agreements (SLAs) signed with the so-called vertical customers: the companies that provide M2M services to end subscribers using the operators’ networks for communication (e.g. transport, security and utilities companies, among many others). The SLAs define the “rules of the game” for the services to be delivered, and they can only be based on objective, measurable Key Performance Indicators (KPIs). Finding a balance between the customers’ service requirements and the operators’ ability to commit to them is often a big challenge, given the quality-of-service gap described above.

During a meeting held this month, an M2M manager at a major tier-1 operator in Europe told BlueTC: “We (the carriers) can only define the SLA’s based on network KPIs, typically Core and RAN indicators for performance and quality, but a M2M customer does not understand the actual service in these terms”. The indicators referred to are often global yearly averages of network availability, plus average throughput and transactions per second measured in the Core, among others. In addition, the same operator measures and reports the indicators that determine compliance with the SLA contracts; in other words, it acts as both player and referee.

The customer, however, understands quality of service in terms of how their own usage is affected. For an end subscriber, that means having service every time they use their devices: good coverage, the agreed speed or better, no application errors, no interruptions while browsing, no dropped calls, and no delays overall. A vertical customer may need even higher availability, since any failure or delay in the service can mean a monetary loss or the inability to fulfil a critical function. Imagine some M2M cases: a temporary communication failure in an e-health service means vital data cannot be sent during a medical emergency. In an automated security system, faulty coverage leads to a missed notification of a robbery in progress in a home or a store. Beyond the immediate damage, such issues erode confidence in the solutions provided.

When SLAs are defined as percentages of network availability, or as the number of outages caused by major incidents during a year, the resulting granularity is too coarse to cover the critical cases described above. For example, an outage in a P-GW serving one region could, on average, be offset in an SLA by the fact that the networks in the other regions are unaffected, yet in reality it would mean big losses for every customer in that region. On the other hand, defining SLAs so granularly that no end-subscriber application service may be affected, at telco-grade five nines (99.999% availability), would not be feasible for any operator in the world. For instance, any storm temporarily affecting coverage, or any fibre cut caused by unplanned roadworks unrelated to the operator, would unfairly break that SLA. And that is without mentioning that the operator has no direct visibility over the end subscribers’ application services.
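
To put rough numbers on the averaging effect described above, here is a minimal Python sketch. The five regions and the 8-hour P-GW outage are invented for illustration and do not come from any operator; the only fixed fact is the arithmetic of a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def availability(downtime_minutes: float) -> float:
    """Fraction of the year the service was up."""
    return 1 - downtime_minutes / MINUTES_PER_YEAR


def allowed_downtime_minutes(target: float) -> float:
    """Yearly downtime budget implied by an availability target."""
    return (1 - target) * MINUTES_PER_YEAR


# Hypothetical figures: five regions, one of which loses its P-GW for 8 hours.
downtime_by_region = {
    "north": 0,
    "south": 0,
    "east": 0,
    "west": 0,
    "centre": 8 * 60,
}

per_region = {name: availability(m) for name, m in downtime_by_region.items()}
network_average = sum(per_region.values()) / len(per_region)

print(f"Five-nines budget: {allowed_downtime_minutes(0.99999):.2f} min/year")
print(f"Affected region:   {per_region['centre']:.5%}")
print(f"Network average:   {network_average:.5%}")
```

Under these assumptions the affected region sits at roughly 99.91% for the year while the network-wide average stays near 99.98%, so an SLA written against the average barely registers an outage that cost one region a full working day. The same arithmetic shows why five nines per subscriber is so demanding: it leaves a little over five minutes of downtime per year.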

To bridge this gap, operators must look to implement innovative mechanisms for measuring the service as customers actually perceive it at their end. Systems that measure and report connectivity KPIs from the user’s perspective, including indicators such as live signal, throughput, latency, packet loss and jitter, among others, could provide operators with the real-time insights they need. Such monitoring systems should complement the end-to-end visibility operators need today, and would also be useful for defining SLAs in terms that are common and feasible for both parties. It is easy to imagine an SLA for an M2M vertical customer defined in terms of a specific number of devices achieving certain percentages of these KPIs, giving a real, evolved sense of the quality of experience of the network’s connectivity.
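
As a rough illustration of what such a device-level SLA check might look like, the Python sketch below counts the share of devices whose reported KPIs fall within agreed limits. Every name, threshold and sample reading here (`DeviceReport`, `SlaThresholds`, the 95% target) is a hypothetical placeholder, not a value taken from a real contract or monitoring product.

```python
from dataclasses import dataclass


@dataclass
class DeviceReport:
    """One KPI snapshot reported from the device side (hypothetical schema)."""
    device_id: str
    signal_dbm: float
    throughput_kbps: float
    latency_ms: float
    packet_loss_pct: float
    jitter_ms: float


@dataclass
class SlaThresholds:
    """Illustrative limits only; a real SLA would negotiate these values."""
    min_signal_dbm: float = -100.0
    min_throughput_kbps: float = 64.0
    max_latency_ms: float = 500.0
    max_packet_loss_pct: float = 1.0
    max_jitter_ms: float = 50.0


def meets_thresholds(r: DeviceReport, t: SlaThresholds) -> bool:
    """True when every KPI in the report is within its limit."""
    return (
        r.signal_dbm >= t.min_signal_dbm
        and r.throughput_kbps >= t.min_throughput_kbps
        and r.latency_ms <= t.max_latency_ms
        and r.packet_loss_pct <= t.max_packet_loss_pct
        and r.jitter_ms <= t.max_jitter_ms
    )


def compliant_share(reports: list[DeviceReport], t: SlaThresholds) -> float:
    """Fraction of reporting devices that meet all thresholds."""
    if not reports:
        raise ValueError("no device reports to evaluate")
    return sum(meets_thresholds(r, t) for r in reports) / len(reports)


def sla_met(reports: list[DeviceReport], t: SlaThresholds,
            required_share: float = 0.95) -> bool:
    """SLA holds when at least `required_share` of devices are compliant."""
    return compliant_share(reports, t) >= required_share


# Invented sample readings for four devices of a vertical customer.
reports = [
    DeviceReport("meter-001", -85.0, 120.0, 180.0, 0.2, 12.0),
    DeviceReport("meter-002", -92.0, 95.0, 240.0, 0.5, 20.0),
    DeviceReport("alarm-001", -108.0, 80.0, 300.0, 0.4, 18.0),  # weak coverage
    DeviceReport("alarm-002", -80.0, 150.0, 90.0, 0.1, 5.0),
]

share = compliant_share(reports, SlaThresholds())
print(f"{share:.0%} of devices within thresholds")
print("SLA met:", sla_met(reports, SlaThresholds(), required_share=0.95))
```

Expressing compliance as a share of devices rather than a network-wide average keeps a single badly covered device visible, while still leaving the operator some tolerance for events outside its control.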


