As AI merges into business operations, CIOs are under pressure to deliver the best results. Combine this with customers demanding real-time responses, and Edge Computing is emerging as a route to secure, efficient and accurate AI adoption. Andre Reitenbach, CEO, Gcore, tells us more.
The roots of AI date back decades, but with the introduction of OpenAI's tools almost everyone was given an easy way to apply AI and Machine Learning, and many took advantage of it. In the last two years, advances in AI have seen an increasing number of companies and start-ups build Large Language Models to improve their business operations and use AI to gain enterprise-level insights that sharpen decision-making.
The pressure is now on CIOs to embrace AI more broadly across applications and business processes, deliver more personalised customer experiences and better manage sensitive data. But AI is not something you can pull off the shelf and plug in. Integrating AI into the fabric of a company requires a deep understanding of the infrastructure needed to underpin it, and the demands it will make in terms of both compute power and energy usage as models grow.
According to the World Economic Forum, achieving a tenfold improvement in AI model efficiency could see computational power demand surge by up to 10,000 times, while the energy required to run AI tasks is growing by between 26% and 36% every year.
Running alongside this is the desire for real-time responses, which is where AI at the Edge has become so important. Apple, for example, has just announced that Siri will be powered by a Generative AI system, allowing it to hold a conversation rather than respond to one question at a time. Of course, if you are an Apple device user you will expect to enjoy this feature regardless of where in the world you are located, which means it depends entirely on global Edge AI deployment, removing the latency risk of sending data back and forth to a remote cloud or data centre.
The other key issue for CIOs to consider is privacy. While we live in a globalised world, leaders are increasingly concerned about data sovereignty and about protecting the data collected or stored on their citizens without impeding business operations. The question is not just how AI systems manage and use data in a particular sector, such as healthcare or finance, but whether the infrastructure can run models in any given location while remaining compliant with that region's regulations. AI at the Edge has the advantage of offering improved privacy and security by retaining sensitive data locally.
The key steps to preparing for Edge AI
Training and inference
There are typically three stages that companies go through when developing AI models: training, inference and distribution. Companies need powerful computing resources like GPUs to train large AI models on massive datasets. The world's top tech giants use hundreds of thousands of GPUs for this training phase, but CIOs starting on the journey can be more conservative, scaling their models over time as AI usage becomes more complex.
Once the model is pre-trained, inference can be run on it: using the model to generate outputs such as text, images or predictions from new data inputs. This inference stage often happens in the cloud and requires significant, though less extreme, computing power than the training phase.
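As a rough illustration of what this inference stage involves, the sketch below loads a small, openly available pre-trained model and generates text from a new input. The model name, prompt and generation settings are illustrative assumptions, not a specific recommendation.

```python
# Minimal inference sketch: load a pre-trained model and generate output
# from a new prompt. Model choice and settings are illustrative only.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # small open model as a stand-in

prompt = "Edge computing helps AI adoption because"
outputs = generator(prompt, max_new_tokens=40, num_return_sequences=1)
print(outputs[0]["generated_text"])
```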
Finally, to serve end-users worldwide with minimal latency, the model's inference outputs need to be delivered globally. Extensive content delivery networks (CDNs) with Points-of-Presence (PoPs) across the world (our own has more than 180 PoPs) are best positioned to assist with this, running AI at the Edge close to end-users. The closer an end customer is to a PoP, the faster they will be able to interact with the AI model.
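To make the proximity argument concrete, here is a back-of-the-envelope sketch (not Gcore's actual routing logic) that estimates the best-case round-trip time from a user to a handful of hypothetical PoP locations, assuming signals travel through fibre at roughly two-thirds the speed of light.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Hypothetical PoP locations (latitude, longitude) and an example user in Paris.
pops = {"Frankfurt": (50.11, 8.68), "Singapore": (1.35, 103.82), "Sao Paulo": (-23.55, -46.63)}
user = (48.86, 2.35)
speed_km_per_ms = 200  # ~2/3 of the speed of light in fibre

for city, (lat, lon) in sorted(pops.items(), key=lambda kv: haversine_km(*user, *kv[1])):
    distance = haversine_km(*user, lat, lon)
    rtt_ms = 2 * distance / speed_km_per_ms
    print(f"{city}: {distance:,.0f} km away, best-case round trip ~{rtt_ms:.1f} ms")
```

In this toy example the nearest PoP can answer within a few milliseconds, while a distant one adds roughly a tenth of a second of pure propagation delay before any processing has even begun.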
Low latency and costs
Like other digital services, AI applications require low latency to deliver a responsive user experience. The push for low latency was initially driven by the needs of the gaming, e-commerce, finance and entertainment sectors; AI follows the same trend, benefitting from secure, high-performance global networks and Edge compute resources.
While the centralised cloud provides immense compute power for training, it now struggles to compete against other providers when it comes to privacy, low latency and cost efficiency. CIOs would benefit from AI inference distribution that can cost-effectively serve the less intensive inference workloads to end-users on an on-demand, pay-as-you-go basis. Costs are calculated against the GPUs needed, allowing businesses to start with basic GPUs and scale resources elastically as usage grows.
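As a back-of-the-envelope sketch of how pay-as-you-go GPU costs might be estimated, the snippet below works out how many GPUs a steady inference load needs and what it would cost per month. The hourly rate, request volume and per-request GPU time are all illustrative assumptions, not published pricing.

```python
import math

def monthly_inference_cost(gpu_hourly_rate, requests_per_second,
                           gpu_seconds_per_request, utilisation=0.7):
    """Estimate GPUs needed and monthly cost for a steady inference workload."""
    # GPU-seconds of work arriving each second, padded for realistic utilisation.
    gpus_needed = max(1, math.ceil(requests_per_second * gpu_seconds_per_request / utilisation))
    hours_per_month = 24 * 30
    return gpus_needed, gpus_needed * gpu_hourly_rate * hours_per_month

# Example: 5 requests/second, 0.4 s of GPU time each, at a notional $1.50 per GPU-hour.
gpus, cost = monthly_inference_cost(1.50, requests_per_second=5, gpu_seconds_per_request=0.4)
print(f"{gpus} GPU(s), roughly ${cost:,.0f} per month")  # 3 GPU(s), roughly $3,240 per month
```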
Use cases and challenges
Enterprises are still exploring AI's potential use cases, from automated systems and robotics to HR and customer service chatbots. But they need to weigh factors like data control, compliance, skills and budgets.
Large Language Models like ChatGPT offer powerful but generalised AI capabilities. For production use, enterprises may prefer fine-tuning such models on their proprietary data for greater relevance, accuracy and control. They can leverage open-source models or contract AI experts to develop customised, private models.
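As an illustration of what fine-tuning an open-source model on proprietary data can look like, here is a minimal sketch using the Hugging Face libraries; the base model, data file and hyperparameters are placeholder assumptions, and a production setup would add evaluation, checkpointing and access controls.

```python
# Minimal fine-tuning sketch: adapt an open-source base model to in-house text.
# Model name, data file and hyperparameters are illustrative placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base_model = "gpt2"  # stand-in for any open-source base model
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)

# Proprietary data: one training example per line in a local file (hypothetical path).
dataset = load_dataset("text", data_files={"train": "company_docs.txt"})
tokenized = dataset["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-model",
                           num_train_epochs=1, per_device_train_batch_size=2),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
trainer.train()
```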
Key considerations for CIOs include understanding the specific use cases that are appropriate to their business and the desired outcome; evaluating make-vs-buy options for model development based on in-house AI expertise and available resources; choosing the right scalable infrastructure for training and inference; deciding the best route to ensure low latency; handling data governance and privacy; and optimising costs.
Overall, while AI presents transformative opportunities, operationalising it requires careful planning around technology infrastructure, economic models, use cases and data strategies. Working with specialised AI cloud and consulting providers who understand and can meet expectations for delivering AI at the Edge can accelerate effective AI adoption.