Innovate Americas 24: Want your AI to work? Sort out your data layer
Ideas on how communications service providers (CSPs) can use AI abounded at TM Forum's Innovate Americas 2024, and they all rely on good data, as Sreedhar Rao, Global CTO for Telecom with Snowflake, explains in an interview.
Data experts differ on whether there are five, seven, nine or more essential measures of data quality. Two characteristics that consistently make the list, however, are data completeness and accessibility. That's because without them, companies build technology silos, as some telcos are discovering afresh with the latest round of AI technology deployments.
“Everyone has data in silos across the organization and they’re doing analytics only with the data they have in their system, so they’re missing a big piece of information,” says Sreedhar Rao, Global CTO for Telecom with Snowflake.
Rao explains that this problem grew worse during the first round of AI proofs of concept. Many CSPs, he observes, now find themselves with multiple AI silos because most rushed to roll out proofs of concept demonstrating to executive boards that the company is developing real expertise with AI and GenAI. Often the result is multiple AI programs, each using different vendors, data sets, and architectures.
Now, says Rao, nearly every CSP he talks to around the world "wants to be an AI telco." One of the major motivators is being able to do predictive network management and solve problems before they occur. But this is a data-intensive practice that is undermined by the way many CSPs manage their disparate and voluminous data silos today, contends Rao.
“People are collecting the data for 5 to 7 days for operations needs and then they dispose of it because they can’t keep it. So, you aggregate data and then only have aggregate data. And if I want to do trend analysis, I have big holes,” says Rao. This makes it very difficult to do predictive work accurately. “If the only data I can use for prediction is that 7 to 10 days of raw operations data to run an AI telco, that won’t be enough; you need that granular data or you’re making decisions on macro-level datasets,” Rao explains.
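A minimal sketch in Python, using pandas with hypothetical retention windows and KPI values, shows how this kind of policy leaves a trend model working from two very different resolutions:

```python
import numpy as np
import pandas as pd

# Hypothetical per-minute feed for one network KPI over 30 days.
idx = pd.date_range(end="2024-09-30", periods=30 * 24 * 60, freq="min")
raw = pd.DataFrame(
    {"kpi": np.random.default_rng(0).normal(100, 10, len(idx))}, index=idx
)

# Typical operational policy: keep raw samples for ~7 days,
# retain only daily aggregates for anything older.
cutoff = idx[-1] - pd.Timedelta(days=7)
recent_raw = raw[raw.index > cutoff]                        # minute granularity
older_agg = raw[raw.index <= cutoff].resample("D").mean()   # daily means only

print(len(recent_raw), "raw points vs", len(older_agg), "daily aggregates")
# ~10,000 fine-grained points next to ~23 coarse values: the "big holes"
# Rao describes when long-range trend analysis needs granular history.
```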
More granularity equates to higher data volumes, which calls into question CSPs' traditional practice of keeping network data and operations on their own premises.
Rao says CSPs have several concerns about taking network operations off premises and into a public cloud, ranging from a reluctance to disrupt existing systems and worries about network security to the bill shock of early, non-optimized forays into public clouds.
At the rate at which data volumes have grown and will continue to grow, however, the on-premises approach becomes a cash sinkhole that only grows deeper over time, believes Rao: “When you do analytics, it’s not one type. It’s a lot of different analytics and each needs a CPU cycle and data. So, you don’t have enough capacity or bandwidth to manage that in an on-premises system; it’s a losing battle.”
It doesn't end there. As the industry progresses through technology generations, like 4G to 5G, data volumes increase by an order of magnitude or more, according to Rao. "4G was hundreds of terabytes, but 5G is petabytes," and 6G volume will be even greater, Rao predicts. Similarly, where radio access networks (RANs) in 4G dealt with hundreds of key performance indicators (KPIs), 5G has thousands of KPIs "because people figured out these are the metrics we need to optimize," Rao says.
To manage these rapidly escalating data volumes driven by factors like 5G networks and AI adoption, Rao says, "you need a combination of on-premises and cloud. No one is as auto-scalable as the hyperscalers, so let them do it," he advises. And once access to data is solved, it has to be analyzed to be of value, "so you need more CPUs, more processing, and that's not cheap." Rao explains that CSPs typically need to design their on-premises architecture to accommodate peak scale, but this means paying for resources that aren't being used most of the time.
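A back-of-envelope model makes the peak-versus-elastic point concrete. All figures below are illustrative, not Rao's:

```python
# Compare paying for peak capacity 24/7 on premises with paying
# only for the capacity actually used in an elastic cloud.
PEAK_UNITS = 100       # capacity needed at the busiest hour
AVG_UNITS = 25         # average utilization across the day
UNIT_HOUR_COST = 1.0   # normalized cost of one capacity-unit-hour

hours_per_year = 24 * 365
on_prem = PEAK_UNITS * hours_per_year * UNIT_HOUR_COST   # sized for peak, always on
elastic = AVG_UNITS * hours_per_year * UNIT_HOUR_COST    # scales with actual load

print(f"on-prem (peak-sized): {on_prem:,.0f}")
print(f"elastic (pay-per-use): {elastic:,.0f} -> {on_prem / elastic:.0f}x difference")
```

Even at an identical unit price, provisioning for peak and running it around the clock costs four times the pay-per-use figure in this toy example; real pricing differs, but the shape of the argument is Rao's.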
Rao says that CSPs are also learning quickly that hyperscalers may deliver all the building blocks but “you still have to build all the solutions out.” And one of the most critical solutions for making an AI telco work is a highly efficient data layer that democratizes access to data across the telco.
When CSPs employ a data layer that enables them to manage the use of disparate and differently modelled data across their business, whether on premises or on a private or public cloud, they tend to see a rapid increase in the number of users working with data from all over the enterprise, explains Rao. But they also sometimes make the mistake of running the data layer like an on-premises solution and, as a result, fail to optimize all the underlying factors that add to overall cost. "In an on-prem, you can launch a VM and just leave it running. But we say you don't need to because that's an escalating cost you don't have to pay."
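Given that Rao's employer is Snowflake, one concrete way to encode that advice is the platform's warehouse auto-suspend setting, sketched below with the Snowflake Python connector. The credentials and warehouse name are placeholders, and the same idle-timeout idea exists on other cloud platforms:

```python
import snowflake.connector  # pip install snowflake-connector-python

# Hypothetical connection parameters for illustration only.
conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="..."
)
conn.cursor().execute(
    # Suspend compute after 60 idle seconds and wake it on demand,
    # rather than leaving it running the way an on-prem VM would be.
    "ALTER WAREHOUSE ANALYTICS_WH SET AUTO_SUSPEND = 60 AUTO_RESUME = TRUE"
)
conn.close()
```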
Rao encourages CSPs to learn how to optimize all aspects of managing their data loads, with the aim of delivering 5 to 10% optimization every year, so that "the same query today will run better and cheaper next year." He cautions, however, that CSPs should not expect their overall investment in networks and AI to decline. Instead, continuously improving the efficiency of the entire data layer will make the costs driven by escalating data volumes tenable.
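A toy compounding model shows how those two forces interact; the growth and optimization rates below are hypothetical, chosen only to illustrate the arithmetic:

```python
# Data volume grows 30%/yr while per-unit query cost improves 7%/yr
# through the kind of continuous optimization Rao describes.
volume, unit_cost = 1.0, 1.0
for year in range(1, 6):
    volume *= 1.30     # escalating data volumes
    unit_cost *= 0.93  # "the same query runs better and cheaper next year"
    print(f"year {year}: relative spend = {volume * unit_cost:.2f}")
# Spend still rises because volume outpaces optimization, but far more
# slowly than the unoptimized 1.30**year curve: tenable, not declining.
```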
CSPs also need to share data with each other, their partners, and regulators, says Rao, who explains that service contracts between CSPs and network vendors require them to exchange "terabytes of data" at times.
Ideally, CSPs and their partners require a data environment "where no one loses control of their data and nothing crosses out of its own domain," Rao says. At the same time, CSPs and their partners want to avoid recreating a "massive infrastructure to analyze all the data". Instead, the shift should be towards bringing "the application to the data, not the data to the application."
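Secure data sharing is one established pattern for this; Snowflake, for example, lets a consumer account query a provider's tables in place so no copy ever leaves the provider's domain. The sketch below uses the Snowflake Python connector, and every account, database, and table name is hypothetical:

```python
import snowflake.connector

# Hypothetical provider-side setup: expose one table to a network
# vendor's account without exporting any data.
conn = snowflake.connector.connect(
    account="csp_account", user="admin", password="..."
)
cur = conn.cursor()
for stmt in (
    "CREATE SHARE IF NOT EXISTS vendor_share",
    "GRANT USAGE ON DATABASE netops TO SHARE vendor_share",
    "GRANT USAGE ON SCHEMA netops.kpis TO SHARE vendor_share",
    "GRANT SELECT ON TABLE netops.kpis.ran_metrics TO SHARE vendor_share",
    # The vendor's account now queries the live table directly; the
    # "terabytes of data" are never copied across domain boundaries.
    "ALTER SHARE vendor_share ADD ACCOUNTS = vendor_account",
):
    cur.execute(stmt)
conn.close()
```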
In the end, a data layer's purpose is to make data available and consumable across ecosystems; to solve classic problems like reconciling disparate data models; and to manage data transport costs while ensuring that unutilized resources, such as virtual machines, do not generate unnecessary costs. One way or another, these are the types of issues CSPs will face when deploying AI at scale and promoting its use company-wide.