The partnership with AI cloud provider Cirrascale will allow access to the world's biggest AI chip.
The second-generation wafer scale engine from Cerebras, built to accelerate large-scale AI workloads, is now available for public use in the cloud via specialist AI cloud provider Cirrascale. A CS-2 system, which houses the second-generation wafer scale engine, has been installed at Cirrascale’s Santa Clara, Calif., location.
Cerebras joins a handful of AI chip startups with hardware available for customer workloads in the cloud, including Graphcore and Groq.
Cirrascale, a cloud offering designed especially for AI workloads including autonomous driving and natural language processing, also has Graphcore hardware available in its cloud alongside Nvidia GPUs, plus AMD Epyc and IBM Power CPUs.
Cerebras said teaming up with Cirrascale is an important step in democratizing high-performance AI computing, by making it available to more types of customers. To date, Cerebras has wafer-scale systems installed in several academic supercomputers as well as pharma and other large on-premises enterprise data centers.
“It will enable a whole class of customers, many who are already customers of Cirrascale, and others… who want access to the system in a cloud infrastructure,” said Andrew Feldman, CEO of Cerebras.
The CS-2 features 850,000 AI-optimized compute cores, 40 GB of on-chip SRAM, 20 PB/s memory bandwidth and 220 Pb/s interconnect, fed by 1.2 Tb/s of I/O across 12x 100 Gb Ethernet links. Cerebras announced last month that a new memory extension system enables a single CS-2 to train models with 120 trillion parameters.
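The I/O figure follows directly from the link count. As a quick check, here is a minimal sketch of the arithmetic using only the numbers quoted above; the function and variable names are illustrative and do not come from Cerebras, and decimal units (1 Tb = 1,000 Gb) are assumed.

```python
# Figures quoted in the article; names are illustrative, not Cerebras terminology.
ETHERNET_LINKS = 12
GBITS_PER_LINK = 100  # Gb/s per Ethernet link


def aggregate_io_tbps(links: int, gbits_per_link: float) -> float:
    """Aggregate I/O in terabits per second, assuming 1 Tb = 1,000 Gb."""
    return links * gbits_per_link / 1000


def aggregate_io_gbytes(links: int, gbits_per_link: float) -> float:
    """The same aggregate expressed in gigabytes per second (8 bits per byte)."""
    return links * gbits_per_link / 8


total_tbps = aggregate_io_tbps(ETHERNET_LINKS, GBITS_PER_LINK)
total_gbytes = aggregate_io_gbytes(ETHERNET_LINKS, GBITS_PER_LINK)
print(f"{total_tbps} Tb/s, or {total_gbytes} GB/s")
```

This is raw link capacity only; protocol overhead, which the article does not discuss, would reduce usable throughput.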
A key selling point is that as neural networks expand rapidly, reaching billions or trillions of parameters, the CS-2 is big enough to handle training networks on a single node.
“One little known fact about our industry is that few people can actually build big clusters of GPUs – how rare it is…. Not just the money, but the skills to spread a large model over more than 250 GPUs, [those systems] are probably resident in a couple of dozen organizations in the world,” said Feldman.
“Because this is a single device, it is much, much simpler to program than tens of thousands of GPUs,” added PJ Go, CEO of Cirrascale Cloud Services. “When you spread the workload among hundreds or thousands of servers you have a lot of network overhead, a lot of synchronization overhead that has to come into play, and frankly, that’s not accessible to many companies and individuals. [Cerebras CS-2], on the other hand – you take your model, you put it on a single CS-2 device, and you scale it up.”
The single CS-2 system available in the Cirrascale cloud will not be shared concurrently among users, because AI workloads demand the highest possible performance to finish sooner, Feldman and Go said. They noted that customers generally can’t afford the 5-10 percent performance hit from a hypervisor.
“If you want to divide up a single GPU and spend $100 on a fraction of a GPU, you’re not our customer,” Feldman said. “We begin where you want a dozen, or 30, or 60 GPUs, to do the work you have. And at that point, it doesn’t make sense to divide up a machine. It makes sense to give you all of a machine, so you can do that work in a fraction of the time. So you can think about whether you want to do additional work in the time it would have taken you, had you done it on a different piece of hardware.”
The CS-2 will be available in weekly or monthly time slots starting at around $60,000 per week. A CS-2 costs “several million dollars” to buy outright, though Cerebras has made subscription models available; Cerebras hopes weekly cloud access will make the system more financially attractive.
“One of the benefits to customers is turning a big capex number into a monthly or weekly number that is much more easily digestible by customers,” said Cirrascale’s PJ Go. “We anticipate that we’ll get customers who want to try it for a week and review results, then either continue on the cloud or frankly, buy a CS-2 and put it on-prem.”
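To put the rental price in context, the sketch below estimates how many weeks of cloud rental would add up to an outright purchase. The weekly rate is the article's approximate figure; the purchase prices are hypothetical placeholders, since the article says only “several million dollars,” and the calculation ignores monthly pricing, power, hosting, and staffing costs.

```python
import math

WEEKLY_RATE_USD = 60_000  # "around $60,000 per week" per the article


def breakeven_weeks(purchase_price_usd: float,
                    weekly_rate_usd: float = WEEKLY_RATE_USD) -> int:
    """Whole weeks of rental whose total cost first meets or exceeds the purchase price."""
    return math.ceil(purchase_price_usd / weekly_rate_usd)


# Hypothetical purchase prices, chosen only to illustrate "several million dollars".
for price in (2_000_000, 3_000_000, 5_000_000):
    print(f"${price:,}: about {breakeven_weeks(price)} weeks of rental")
```

Under these assumed prices, the rental total reaches the purchase price only after many months of continuous use, which fits Go's point about customers trying the system for a week before deciding whether to buy.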
The Cerebras CS-2 instance in Cirrascale’s cloud is available now.
This article was originally published on EE Times.
Sally Ward-Foxton covers AI technology and related issues for EETimes.com and all aspects of the European industry for EE Times Europe magazine. Sally has spent more than 15 years writing about the electronics industry from London, UK. She has written for Electronic Design, ECN, Electronic Specifier: Design, Components in Electronics, and many more. She holds a Master’s degree in Electrical and Electronic Engineering from the University of Cambridge.