Ampere Computing and Scaleway Introduce Cost-Optimized Arm-Based Servers for AI Services
Ampere Computing and Scaleway, a French cloud operator, have joined forces to promote the use of Arm-based servers for AI-based services, particularly inferencing. The companies announced the availability of cost-optimized COP-Arm instances at the ai-PULSE conference in Paris. These instances are operated by Scaleway using servers powered by Ampere’s Altra family of Arm-based datacenter processors.
Cost-Efficient Solution for AI-Driven Applications
Scaleway says the COP-Arm instances are designed specifically for the demands of AI-driven applications such as chatbots, real-time analytics, and video content analysis, and that they can be delivered at a fraction of the cost of other solutions, making them a cost-efficient alternative for businesses.
“With Ampere Altra Processors, we are offering businesses a powerful and cost-effective alternative, enabling them to achieve high-performance results in the most efficient and sustainable way possible,” said Scaleway CEO Damien Lucas.
Addressing the Needs of Inferencing Workloads
While demand for powerful servers with top-end GPUs has been rising for generative AI, Ampere's chief product officer, Jeff Wittich, notes that this infrastructure is primarily useful for training models. For inferencing workloads, he argues, supercomputing-class hardware is not necessary.
“Often when we talk about AI, we forget that AI training and inference are two different tasks,” Wittich explained. He emphasized that inference, which involves continuously running models at scale, can be handled efficiently by general-purpose CPUs. Efficiency matters more for inference precisely because it is a continuous process.
Power Efficiency and Cost Comparison
Wittich cited the example of running OpenAI's Whisper speech recognition model on Ampere's 128-core Altra Max processor, which he said consumes 3.6 times less power per inference than an Nvidia A10 Tensor Core GPU. The pricing details of Scaleway's COP-Arm instances, however, have yet to be disclosed. Ampere also shared a testimonial from an early customer, Lampi.ai, a Paris-based AI copilot platform developer, whose CEO, Guillaume Couturas, claimed that COP-Arm delivers 10 times the speed at a tenth of the cost of competitors running on x86 architectures.
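The figures above are relative, so a rough back-of-the-envelope combination is possible. The sketch below is illustrative only: the GPU energy baseline is a made-up number, and only the ratios (Ampere's 3.6× power-per-inference claim and Lampi.ai's 10× speed at a tenth of the cost) come from the announcement.

```python
# Back-of-the-envelope sketch of the ratios quoted above.
# NOTE: gpu_joules below is a hypothetical baseline; only the ratios
# (3.6x power, 10x speed, 1/10 cost) are taken from the article.

def cpu_energy_per_inference(gpu_joules: float, advantage: float = 3.6) -> float:
    """Energy per inference on the CPU, given a GPU baseline and Ampere's
    quoted 3.6x power-per-inference advantage over an Nvidia A10."""
    return gpu_joules / advantage

def price_performance_gain(speedup: float = 10.0, cost_reduction: float = 10.0) -> float:
    """Combined price-performance multiple implied by Lampi.ai's claim:
    10x the speed at 1/10 the cost compounds to speedup * cost_reduction."""
    return speedup * cost_reduction

if __name__ == "__main__":
    # With a hypothetical 36 J per inference on the GPU, the CPU figure
    # would be about 10 J; the compounded price-performance claim is 100x.
    print(cpu_energy_per_inference(36.0))
    print(price_performance_gain())
```

The point of the arithmetic is that the two Lampi.ai numbers compound: a 10× speedup at a tenth of the cost implies roughly a 100× difference in performance per dollar, if both claims hold.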
Ampere’s recent launch of the AI Platform Alliance, which aims to create an open AI ecosystem to challenge Nvidia, demonstrates the company’s commitment to advancing AI technology. Additionally, Ampere introduced the more powerful AmpereOne processor this year, which consumes slightly more power than the Altra family but offers enhanced performance.
Photo: Scaleway.com