SiCortex tips MIPS-based ASIC for HPC clusters
Every good startup tries to practice a little technological jujitsu. By applying a bit of force in just the right place, a handful of people can knock a large and fast-moving industry sector slightly off balance, and open up a new and significant opportunity. That's just what SiCortex Inc. is attempting to do in the hot area of clustered servers for high-performance computing (HPC).
Today, most of the HPC sector is stacking off-the-shelf x86 processors into powerful piles that rival the most muscular custom-built architectures. The sector has adopted a combination of Linux and the message-passing interface (MPI) as its software platform. "We wanted to slide a new hardware architecture under that API that could be three to 10 times better," said John Goodhue, VP of marketing for the 30-person company.
Today's cluster-packed data centers "get an enormous amount of heat, so it takes a fair amount of money to run these installations," Goodhue said.
So the startup took the radical path of developing an ASIC intended to beat competing x86 CPUs on power, size and cost while still running Linux and MPI. SiCortex settled on a custom system chip that integrates six MIPS Technologies 64-bit 5Kf cores running at up to 500MHz, along with a proprietary fabric switch architecture, two DDR memory controllers, a PCIe controller and 1.2Mbytes of L2 cache.
The resulting ASIC is in effect a cluster node on a 10W chip. It competes with off-the-shelf x86 boards that consume as much as 250W. "For anyone to compete with us, they'd have to integrate the fabric and memory controllers on board to have a similar story," said Matt Reilly, a co-founder and VP of silicon development for SiCortex. Reilly was a design team leader on the Alpha CPU at Digital Equipment Corp.
SiCortex packs 972 of the chips, for a total of 5,832 cores, into a single cabinet that delivers 5.8 teraflops of performance. The SC5832 sports 8Tbytes of memory and 2.1Tbps of aggregate I/O capacity. It will consume 18kW and cost $1.5 million to $2 million or more, depending on how much memory is used in the configuration.
A lower-end system will use 108 of the chips, or 648 cores, to hit 648 gigaflops. The SC648 is designed for department-level users and offers 864Gbytes of memory and 240Gbps of I/O, while fitting in less than half of a single standard 19-inch rack. It consumes 2kW and can plug into a standard 110V wall outlet. It will cost $150,000 to $250,000, depending on configuration.
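As a rough sanity check on the figures above, the sketch below works out peak performance per chip and how many chips each system would need to reach its quoted rating. It assumes each core completes two floating-point operations per cycle; that rate is an assumption chosen for illustration, not a number attributed to SiCortex, and the function names are hypothetical.

```python
import math

CORES_PER_CHIP = 6     # six MIPS 5Kf cores per chip (from the article)
CLOCK_GHZ = 0.5        # cores run at up to 500MHz (from the article)
FLOPS_PER_CYCLE = 2    # ASSUMPTION: per-core rate is not stated in the article
CHIP_WATTS = 10        # each chip is described as a 10W part


def chip_peak_gflops() -> float:
    """Peak gigaflops of one chip under the assumed per-cycle rate."""
    return CORES_PER_CHIP * CLOCK_GHZ * FLOPS_PER_CYCLE


def chips_needed(system_gflops: float) -> int:
    """Smallest whole number of chips that reaches the quoted system peak."""
    return math.ceil(system_gflops / chip_peak_gflops())


def chip_power_watts(system_gflops: float) -> int:
    """Power drawn by the compute chips alone (memory, I/O and fans excluded)."""
    return chips_needed(system_gflops) * CHIP_WATTS


if __name__ == "__main__":
    # Quoted peak (gigaflops) and quoted system power (watts) for each model.
    for name, gflops, system_watts in (("SC5832", 5800, 18000), ("SC648", 648, 2000)):
        print(name, chips_needed(gflops), "chips,",
              chip_power_watts(gflops), "W of", system_watts, "W")
```

Under that assumption a chip peaks at 6 gigaflops, the SC648's 648 gigaflops works out to 108 chips, the larger system needs just under 1,000, and the chips themselves account for roughly half of each system's quoted power draw.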
Users will probably have to recompile apps for the systems, given the slight tweaks on the Linux-and-MPI environment they support. "There are enough differences in various node configurations that people recompile code often in HPC anyway," said Goodhue.
SiCortex was formed by a handful of veterans from Digital Equipment. Its chairman is Ethernet pioneer Bob Metcalfe. The startup has raised about $42 million in two venture rounds to date.
Compute node on a 10W chip packs six MIPS 5Kf cores and a proprietary switch fabric.
Besides the whole industry of conventional x86 cluster vendors, SiCortex will compete with another startup, Panta Systems Inc., which is taking the opposite approach. Panta implements up to 32 AMD Opteron CPUs in a cluster-in-a-box based on InfiniBand that uses only off-the-shelf chips. Other startups that may try to emulate the SiCortex concept are using IBM Power or Sun Sparc CPU architectures, which those companies are just starting to license.
Process of elimination
SiCortex chose MIPS in part through a process of elimination. No x86 cores were available for licensing, ARM lacked a 64-bit part, and Power and Sparc were not available at the time the startup began work.
The company found the MIPS 5Kf extremely small, at just 3-by-2mm for a block. As a CPU that had been used in various consumer devices, the part also had a solid history with Linux.
Reilly's team added a proprietary switch fabric to the chip that provides three 2GBps links per chip. The fabric strictly enforces in-order delivery of packets to keep the software protocol simple.
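To see why in-order delivery simplifies the software, consider the generic sketch below. It is not SiCortex code: the packet format, with a sequence number and a payload, and both function names are hypothetical. When the fabric guarantees order, a receiver can simply concatenate payloads as they arrive; without that guarantee it must buffer early arrivals until the gaps are filled.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical packet format: (sequence number, payload).
Packet = Tuple[int, bytes]


def reassemble_in_order(packets: Iterable[Packet]) -> bytes:
    """Fabric guarantees ordering: just concatenate payloads as they arrive."""
    return b"".join(payload for _seq, payload in packets)


def reassemble_with_reordering(packets: Iterable[Packet]) -> bytes:
    """No ordering guarantee: hold early packets until the next expected one shows up."""
    pending: Dict[int, bytes] = {}   # packets that arrived ahead of their turn
    next_seq = 0                     # next sequence number we can release
    out: List[bytes] = []
    for seq, payload in packets:
        pending[seq] = payload
        # Release every packet that is now contiguous with what was already delivered.
        while next_seq in pending:
            out.append(pending.pop(next_seq))
            next_seq += 1
    return b"".join(out)
```

The second function needs sequence tracking and extra buffer memory that the first avoids, which is the kind of protocol overhead an ordered fabric removes from the software path.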
Although each chip has a PCIe core, only one in three of those cores is routed to system I/O. The system typically has 36 modules, each of which sports two GbE ports and three PCI Express slots. SiCortex is working to ensure that standard Fibre Channel, Ethernet and InfiniBand adapters from third parties are available for the system.
The chips underwent packaging and test in November. SiCortex was scheduled to bring up its first small prototype systems in December. Commercial systems will be available early this year.
Reilly said creating a design team from scratch and developing its own tool flow were the biggest challenges for SiCortex.
- Rick Merritt