Member Insights
Kevin McDonnell, Senior Director, AI & Autonomy at Huawei, highlights why siloed automation fails and how modular multi-agent systems enable telecom autonomy.
The agentic network: why multi-agent systems will define telecom autonomy
Let’s be honest: telecom networks aren’t getting any simpler. They are becoming more dynamic and more complex, yet most automation solutions still feel stuck in the past. They work fine on narrow, isolated tasks, but when you throw them a real-world challenge such as cross-domain dependencies or cascading failures, they fall short.
So, here’s the question: how do we move from fragmented automation to true, end-to-end autonomy? And how do we do it without a massive, risky transformation that blows up cost and complexity?
The answer isn’t building one “super-agent”. That approach creates bottlenecks and risk. What we’re seeing (and frankly, what makes sense) is multi-agent systems: modular agents and copilots that can work independently and collaborate intelligently.
Here, autonomy is achieved using hierarchical, modular agents: a divide-and-conquer approach in which specialized agents manage complexity within their own domains and collaborate intelligently across domains. It does not require a big-bang rip-and-replace either.
Why does that matter? Because this approach lets operators move step by step. You can introduce new capabilities without throwing away what works today. You preserve your investments and still lay the groundwork for real autonomy across business, service, and network layers.
If we’re going to make this work, there are three principles we can’t ignore:
1. Interoperability from day one
Agents need to talk to each other. Share context. Coordinate actions across domains and vendors. That’s why TM Forum is developing an agent standards suite including TMF939 Agent Management API and TMF785 Copilot API specifications. These standardized interfaces will make agent discovery and context exchange possible. If we skip this step, we end up with brittle integrations and wasted time.
2. Modularity for evolution, not disruption
Operators are not looking for a disruptive revolution; they want a roadmap for evolution. Start with a fault-management agent today. Add an optimization agent tomorrow. That’s how you make adoption practical and low-risk. This pragmatic evolution was demonstrated in the recent Agent Fabric catalyst and is exactly what TM Forum’s work on high-value scenarios is about. It emphasizes evolution, not upheaval, with a clear focus on the CSP’s business priorities.
3. Context quality matters
Something I learned from practical catalyst work this year is that AI agents don’t underperform only because your prompt wasn’t clever enough. That is of course one reason, but mainly they fall short because they’re missing context, or situational awareness, which TM Forum’s agent architecture (IG1251D) calls out as a key capability. Autonomy only works if agents can share “just enough” state, goals, and policies to make decisions we can trust. By context, we mean the information an agent needs beyond raw data: its goals, constraints, policies, and relevant state from other domains. That’s why context exchange is a key part of the discussions in TM Forum’s work on agent interfaces. For example, TMF939 gives us a common way for agents to find each other, share what they can do, and pass the information they need to act intelligently.
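To make “just enough” context a little more concrete, here is a minimal Python sketch of a context envelope that one agent might hand to another. Everything in it is an assumption for illustration: the AgentContext class, its field names, the share_just_enough helper, and the sample values are hypothetical and are not taken from TMF939, TMF785, or IG1251D.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentContext:
    """Hypothetical context envelope; not a TM Forum schema."""
    goal: str
    constraints: tuple = ()
    policies: tuple = ()
    shared_state: dict = field(default_factory=dict)

    def gaps(self):
        """Return the names of context elements that are empty."""
        names = ("goal", "constraints", "policies", "shared_state")
        return [name for name in names if not getattr(self, name)]


def share_just_enough(state, needed_keys):
    """Keep only the state entries the receiving agent asked for."""
    return {key: value for key, value in state.items() if key in needed_keys}


# Made-up state held by a RAN-domain agent.
ran_state = {"cell_load": 0.92, "alarm_count": 3, "firmware": "hypothetical-1.0"}

ctx = AgentContext(
    goal="restore latency below 20 ms",
    policies=("no changes during peak hours",),
    shared_state=share_just_enough(ran_state, {"cell_load", "alarm_count"}),
)

print(ctx.gaps())          # ['constraints']
print(ctx.shared_state)    # firmware is deliberately left out
```

The gaps check is the point of the sketch: a receiving agent can see what it has not been told before it acts, rather than guessing.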
Now, let’s talk about what makes multi-agent systems different. It’s not just “more agents”; it’s about divide and conquer. Complex, cross-domain networks demand specialized agents for assurance, optimization, policy, and healing, each designed to excel in its niche. The real power comes when these agents collaborate, sharing context and coordinating decisions in real time. Done well, this enables adaptability, but let’s be clear: adaptability is a benefit, not the starting point. The starting point is managing complexity without central bottlenecks. Of course, this approach introduces its own challenges, especially the need for strong governance (conflict resolution, consensus building) to prevent agents from working at cross-purposes.
With that in mind, these principles and capabilities distinguish autonomy from basic orchestration:
1. Divide and conquer
Complex networks cannot be managed by a single control point. A divide-and-conquer approach distributes autonomy across specialized agents, each responsible for a well-defined scope such as assurance, optimization, or policy. These agents collaborate through open APIs and shared knowledge, reducing risk and avoiding bottlenecks while supporting incremental adoption.
2. Goal-driven composition
Networks change constantly. So agents can’t rely on rigid playbooks. If a connection or link is degrading, the agent shouldn’t stop at executing a pre-written script—it should dynamically compose the right actions using shared registries and live context.
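As a rough illustration of composing actions from a shared registry and live context, consider the Python sketch below. The Capability class, the REGISTRY entries, the goal names, and the context keys are all hypothetical; a real registry would sit behind a standardized interface rather than in an in-memory list.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Capability:
    """Hypothetical registry entry describing one action an agent offers."""
    name: str
    achieves: str                        # goal this action contributes to
    applicable: Callable[[dict], bool]   # check against live context


REGISTRY = [
    Capability("reroute_traffic", "restore_link_quality",
               lambda ctx: ctx.get("backup_path_available", False)),
    Capability("adjust_tx_power", "restore_link_quality",
               lambda ctx: ctx.get("signal_margin_db", 0) < 3),
    Capability("open_field_ticket", "restore_link_quality",
               lambda ctx: ctx.get("hardware_fault", False)),
    Capability("rebalance_cells", "reduce_congestion",
               lambda ctx: ctx.get("cell_load", 0) > 0.8),
]


def compose(goal, context, registry=REGISTRY):
    """Pick the registered actions that serve the goal and fit the live context."""
    return [cap.name for cap in registry
            if cap.achieves == goal and cap.applicable(context)]


live_context = {"backup_path_available": True, "signal_margin_db": 5}
print(compose("restore_link_quality", live_context))  # ['reroute_traffic']
```

The same goal with a different live context yields a different plan, which is the difference from a pre-written script.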
3. Explainability by design
When machines start making decisions, trust becomes everything. We need an audit or “reasoning” trail. Operators should be able to see why an agent did what it did. Explainability is not optional—it’s what keeps humans and machines working together in live operations.
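One simple way to picture a reasoning trail is as a log of decisions, each stored with the evidence and rationale behind it. The Python sketch below assumes hypothetical Decision and ReasoningTrail classes and made-up agent, action, and metric names; it is not drawn from any TM Forum specification.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Decision:
    """Hypothetical record of one agent decision and why it was made."""
    agent: str
    action: str
    goal: str
    evidence: dict
    rationale: str


@dataclass
class ReasoningTrail:
    entries: list = field(default_factory=list)

    def record(self, decision):
        self.entries.append(decision)

    def explain(self, action):
        """Return a readable reason for every recorded use of an action."""
        return [f"{d.agent} chose {d.action} for goal '{d.goal}': {d.rationale}"
                for d in self.entries if d.action == action]

    def export(self):
        """Serialize the trail so it can be reviewed outside the agent."""
        return json.dumps([asdict(d) for d in self.entries], indent=2)


trail = ReasoningTrail()
trail.record(Decision(
    agent="assurance-agent",
    action="reroute_traffic",
    goal="restore_link_quality",
    evidence={"packet_loss_pct": 4.2},
    rationale="packet loss above threshold and backup path available",
))
print(trail.explain("reroute_traffic")[0])
```

An operator reviewing the exported trail sees the evidence next to the action, not just the action.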
4. Knowledge as a living graph
Context isn’t a static dataset. It is the relationships—events, resources, policies, historical fixes. Modern agent architectures treat this as a graph. That way, agents pull only what matters for the current goal, making them fast and precise.
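To show what “pulling only what matters” could look like, here is a small Python sketch that stores relationships as triples and returns only the facts within a few hops of the resource an agent cares about. The KnowledgeGraph class, the relation names, and the node identifiers are assumptions for illustration; production systems would typically use a graph database rather than in-memory lists.

```python
from collections import defaultdict, deque


class KnowledgeGraph:
    """Hypothetical in-memory graph of (source, relation, target) facts."""

    def __init__(self):
        self.by_node = defaultdict(list)   # node -> facts that mention it

    def relate(self, source, relation, target):
        fact = (source, relation, target)
        self.by_node[source].append(fact)
        self.by_node[target].append(fact)

    def context_for(self, start, max_hops=2):
        """Breadth-first walk collecting facts within max_hops of start."""
        seen, facts = {start}, []
        queue = deque([(start, 0)])
        while queue:
            node, hops = queue.popleft()
            if hops == max_hops:
                continue
            for fact in self.by_node[node]:
                if fact in facts:
                    continue
                facts.append(fact)
                source, _, target = fact
                other = target if source == node else source
                if other not in seen:
                    seen.add(other)
                    queue.append((other, hops + 1))
        return facts


kg = KnowledgeGraph()
kg.relate("alarm-42", "raised_on", "link-7")
kg.relate("link-7", "governed_by", "policy-no-peak-changes")
kg.relate("link-7", "previously_fixed_by", "reroute_traffic")
kg.relate("link-9", "governed_by", "policy-no-peak-changes")
kg.relate("link-9", "connects", "site-B")

for fact in kg.context_for("alarm-42", max_hops=2):
    print(fact)
```

With a two-hop limit, the agent handling alarm-42 gets the link, its policy, and its historical fix, while the unrelated facts about link-9 stay out of its context.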
Put these together and you get agents that are more than modular: they are adaptive, explainable, and context-aware. Pair that with agent protocols and TM Forum’s Open API standards, and now you’re looking at networks that can interact, debate, and reason, rather than merely react.
The industry is moving beyond siloed automation toward something far more powerful: collaborative, agent-based autonomy. Success will depend on three things: interoperability, modularity, and the ability for agents to share just enough knowledge to act intelligently. The journey to Level 4 autonomy has been gradual, starting with high-value scenarios and efforts to quantify benefits at each level. Today, the focus is shifting toward making this vision practical through agents that follow a divide-and-conquer principle.
TM Forum’s AN Project is defining the architectures of multi-agent systems and the APIs that make them work, such as TMF939 and TMF785. This includes collaboration across multiple standards bodies to create common building blocks. Several important agent-related standards, shaped by catalyst achievements, are under development and need industry input. Standardized interfaces for agent interaction will be important for true interoperability.
Now is the time to get involved. By embracing these principles and contributing to open standards, we can move beyond automation and deliver networks that are adaptive, resilient, and ready for what comes next.