With the rise of AI has come vast investment in new data centers around the world. Getting less attention, however, is how all that AI infrastructure will be connected to itself and to everything else. Network operators are definitely gearing up to take on the challenge. With us today to talk about GTT’s approach is Gary Sidhu, SVP for Product Engineering.
TR: What is your background and how did it lead to your current role at GTT?
GS: I started my journey with Qwest, spending 24 years within the Qwest/CenturyLink/Lumen legacy. I started as a software engineer working on network automation. Then they had a huge project called Networx and workflow management became a big thing. We built our own workflow engine, which evolved into enterprise architecture. I was running a couple of Ethernet and fiber programs, which gave me good visibility into end-to-end API platforms. I spent the last couple of years running Lumen Digital, integrating with hyperscalers for AI and also building an SD-WAN SASE function. I joined GTT a year ago, and I think all this learning, building from engineer to architect to API platforms to running large customer segments to building overlay networks, has helped me in working on GTT’s Envision platform.
TR: What is GTT’s Envision platform, and how does it fit in the company’s future?
GS: The customer should be able to design their network quickly. We believe that to achieve that, we need to have the right infrastructure underneath. There are three main components. EnvisionEDGE is compute infrastructure at the edge where you can virtualize and deploy your network and security functions. In EnvisionCORE, we have our own Network Function Virtualization (NFV) sites at 50+ PoPs. And at around 26 of those PoPs we deploy SD-WAN gateways, orchestrators, and other larger software for Software Defined Networking (SDN). And then with EnvisionDX we provide the experience layer. Whether you are a GTT seller, a partner, or a customer, you should have complete visibility of your network, both how it is running and where you want to expand it. We built our EnvisionDX experience to be very site-centric, because that’s a very natural way for an enterprise to expand its network: a new branch, a new store, a new factory. We place a huge emphasis on the delivery part, because if you have infrastructure at both the edge and core functions of the network, then you can really cloudify service delivery.
TR: Why should network operators be optimizing their network infrastructure for AI workloads?
GS: There are six or so main drivers. First is speed. Companies that are building generative AI, training the models, and building AI applications are dealing with massive data and complex computations, and the network needs to handle data transfer quickly to avoid becoming a bottleneck. Real-time AI applications are very sensitive to delay, so latency is also critical.
Second, because these workloads are very unpredictable, the network must be scalable. The infrastructure must be able to handle huge spikes in the workload, especially if you’re doing model training.
Third, you need reliability and stability, because these workloads are long-running – they can take hours or days to train. Any network disruption could force restarting the whole training process, which can get very expensive.
Fourth, these networks should be inherently secure. The measures you put in place for compliance and cybersecurity are very important.
Fifth, edge computing is a priority because some workloads cannot be trained on one massive farm of Graphics Processing Units (GPUs). That could be for various reasons, including different languages, regionalization, etc. You want to have a distributed computing capability.
And at the end of the day, it is about cost optimization. We have to make sure that the network cost is not the long pole in the tent. Efficient network design and management are critical.
TR: How is network infrastructure evolving to meet these needs?
GS: First, new specialized hardware and new protocols are being developed. When workloads need speeds of 400G, networks are reacting to those needs. Specialized switches are emerging to streamline traffic handling. There are also protocols, like remote direct memory access (RDMA), that allow computers to access each other’s memory directly, bypassing the operating system. For network providers, this is a great opportunity to adapt to the new technology.
TR: How do network operators differentiate themselves in this new marketplace?
GS: One differentiator for us is performance guarantees. Providing ultra-low latency and consistent bandwidth for heavy training jobs, and using techniques like network slicing to give GenAI workloads priority, will definitely be attractive for large AI players. If you can give them a good SLA, that is a differentiator. Providing customized connectivity with security and threat detection also differentiates. Encryption and network segmentation become very critical, and I think you have to put so much emphasis on network segmentation that it becomes a natural part of how you design a network. Another thing is to give customers proactive monitoring of these workloads. And then there is the ability to run workloads in a distributed environment. One other thing I didn’t realize until last year is how energy-intensive these operations can be. We’re building more energy-efficient infrastructure connected with better clean energy resources, and I think that’s also an important part of the decision-making.
TR: In what ways is GTT leveraging Artificial Intelligence internally?
GS: We see two functions. Generative AI can be an assistant in EnvisionDX. We believe there is a use case for an assistant with different personas. You could be a seller that will be meeting a customer, and want to know everything about the company, their latest news, how their network is running, where their sites are, etc. You could be a sales engineer going to talk with a customer’s network team, for which you need to know other things. Or you could be a customer. We are building our own generative AI solution to assist with these different cases. The other use case for us is Agentic AI, which will be able to build automations in designing your network and managing your payments. We have identified 5-6 different use cases where we want to build Agentic AI virtual agents. So those are the two main areas. We have already done a proof of concept and built our own GPU infrastructure, and we will be deploying all our AI workloads locally.
TR: At what stage do you think we are in the GenAI adoption process?
GS: The big GenAI players are fairly advanced, with more and more new models, functionality, reasoning, and human-like behavior. They are definitely ahead in the game. In the next 2-3 years, I think we will see demand building up from more enterprises building AI applications, such as the two use cases I just mentioned: assistants and agents. I read recently that only 6% of enterprises have an AI program established, so I think we are probably in the early stages of adoption and implementation. It’s two different communities. People building applications are behind. People building models are far ahead.
TR: What kind of time scale do you think we will see AI adoption occur over?
GS: I personally think it’s very hard to predict, but the journey has started. I think once that curve is up, there’s absolutely no going backward. The foundational pieces, the models, are getting amazingly smart. People are starting to use them in day-to-day life. I feel like it will be maybe 1-2 years before governments and enterprises really ramp up. Once enterprises are building more AI agents, you’re bringing an almost infinite number of users onto our network infrastructure. Right now there are around eight billion humans on the earth, but when you have these machines using the same network there is no limit.
TR: Then what? What’s the next revolution after today’s AI cycle?
GS: I think the next level of evolution will happen with quantum computing, which is probably about 10 years out, depending on who you listen to. I think it will definitely happen within our lifetimes and will amazingly increase the speed. I think it will be sort of like breaking the sound barrier for the first time.
TR: Thank you for talking with Telecom Ramblings!