Microsoft and partners released open-source RTL for a new data compression scheme, and Intel said that another effort may do the same for a security block.
The projects were indicators of the depth and breadth of work to drive the world’s largest data centers forward. They come at a time when Moore’s Law is slowing and workloads such as deep learning are growing, forcing engineers to pull out all the stops in their quest for performance.
For example, vendors showed a half-dozen alternative approaches to cooling fast chips, including immersion baths. With dozens of hot processors and accelerators in the pipeline, one executive said that he hopes an OCP committee can draft standards for the area by next year.
More than 3,500 engineers registered for the annual OCP event. The group has 178 members said to spend $2.56 billion a year on data center gear, forecast to rise to nearly $11 billion by 2022.
Since its founding by Facebook in 2011, the group has released dozens of open-source designs for servers, switches, and other systems and boards. Representatives said that they hope the new projects are just the start of work on open silicon.
“Among cloud providers, we are first to … set a new precedent of contributing RTL, which I hope others will follow … For a new compression standard, you need to seed the whole industry — you want a lot of silicon,” said Kushagra Vaid, general manager of server engineering at Microsoft.
Accordion man: Intel’s Jason Waxman holds up a four-way server Intel co-designed with Inspur in the race for power, and he said the x86 giant will ship Nervana training and inference processors this year. (All images: EE Times)
Project Zipline is a response to the data flood expected to reach 175 zettabytes a year by 2025, according to one recent study. It defines a variation of Huffman encoding optimized for data centers and implemented in a pattern-matching IP block. It slashed the size of test files from Microsoft by 92% to 96% while handling throughput in the tens of gigabytes/second at microsecond latencies.
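For readers unfamiliar with the underlying technique, the following is a minimal sketch of classic Huffman coding in Python. It is not the Zipline algorithm, whose data-center-specific variations are not described here; the function names `huffman_code` and `encoded_bits` are illustrative assumptions.

```python
import heapq
from collections import Counter


def huffman_code(data: bytes) -> dict[int, str]:
    """Build a classic Huffman table mapping each byte value to a bit string.

    Textbook algorithm only; this is NOT the Zipline variant.
    """
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}

    # Heap entries are (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)

    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1

    return heap[0][2]


def encoded_bits(data: bytes, table: dict[int, str]) -> int:
    """Total number of bits needed to encode data with the given table."""
    return sum(len(table[b]) for b in data)


sample = b"aaaaaaabbbc"  # skewed symbol frequencies compress well
table = huffman_code(sample)
bits = encoded_bits(sample, table)
print(f"{len(sample) * 8} bits raw, {bits} bits Huffman-coded")
```

Frequent symbols receive short codes and rare ones long codes, so the saving depends entirely on how skewed the input is. A size reduction of 92% to 96%, as reported for the Microsoft test files, corresponds to the output being roughly one-twelfth to one twenty-fifth of the input.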
Vaid acknowledged that rolling out a new compression technology will take time. So far, Zipline backers include AMD, Arm, Broadcom, Cadence, Intel, Marvell, Mellanox, and Synopsys.
In another project, OCP aims to extend the processor root-of-trust created with its Project Cerberus to all devices in a server. That requires a new protocol and IP block being defined by a group including Facebook, Intel, and Microsoft.
The approach makes the NXP controller used on today’s Cerberus motherboards a master, talking with slave blocks in every peripheral chip. An Intel spokesman said that the group could make that peripheral block open-source.
Rethinking the server motherboard
A Microsoft engineer described his project to break the server motherboard into modules to lower costs and speed design times. Project leader Siamak Tavallaei (below) has already released a high-level description of his concept and enlisted a dozen companies interested in designing a prototype by this summer.
The effort makes a processor and its memory into one module that can be designed as soon as the chip is defined. A variety of CPU blocks can use a secure controller module that runs firmware, monitors temperature, controls fans, and handles other basic chores.
An I/O cable, currently based on PCIe Gen 4, will help reduce board space and close distances between processors and I/O. The shorter reaches will enable up to 60% savings on PCB materials and make room in a chassis for more ports, PCIe slots, and even accelerators such as GPUs.
*Microsoft’s Siamak Tavallaei described a new concept for a modular motherboard.*
Running a cool bath for overheating processors
Alternative cooling systems are all the rage as processors and accelerators get bigger and hotter.
“There are all kinds of funky ideas on the show floor this year,” said Microsoft’s Vaid. “By next year, it would be good if OCP could say, ‘For this wattage, we need X, and for that wattage, we prefer this other cooling system.’”
The group’s cooling committee has been at work for only a few months, so next year’s event may be an aggressive target for a standard. Meanwhile, attendees saw a wide variety of heat pipes, pumps, and more exotic cooling techniques.
Taiwan’s WiWynn showed a two-phase (liquid to vapor) immersion system cooling 100 nodes of a 48-V Facebook Diablo Pass server.
Priming the pump in racks and pizza boxes
One vendor estimated that as many as a dozen liquid-cooling offerings are currently available, in addition to homegrown solutions in the works at some web giants. Even immersion systems now have as many as eight competitors showing single- and dual-phase systems.
One immersion vendor, Submer, said that it has four megawatts worth of systems currently in pilots and expects to announce its first deployment with a 10-MW deal within days.
At LinkedIn’s booth, Zutacore showed solutions ranging from pipes for 1U servers (above) to plumbing systems for racks and a heat-exchanger box that looked like a car radiator in a metal enclosure.
Emerald Pool packs eight accelerators in a box
Facebook is preparing the way for a flood of accelerators expected over the next year.
For example, it is working with Broadcom and Verisilicon on an ASIC for a video transcoder that can handle everything from jerky uploads from handsets to the next series on Facebook Watch. It will be compatible with multiple codecs, including H.264, VP9, and AV1.
The chip needs to handle two 4K streams at 60 frames/second at 10 W and encode multiple streams in parallel. It should also support ffmpeg and VAPI standards, said Vijay Rao, director of technology strategy at Facebook.
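As a rough sense of scale for that target, the sketch below works out the pixel throughput and the throughput per watt. It assumes that "4K" means UHD at 3840 x 2160 pixels, which the requirement as quoted does not specify; the function names are illustrative.

```python
# Assumption: "4K" is taken as UHD, 3840 x 2160 pixels.
UHD_WIDTH = 3840
UHD_HEIGHT = 2160


def pixels_per_second(width: int, height: int, fps: int, streams: int) -> int:
    """Raw pixel throughput for a number of identical video streams."""
    return width * height * fps * streams


def pixels_per_joule(pixel_rate: int, watts: float) -> float:
    """Pixels processed per joule (equivalently, pixels/second per watt)."""
    return pixel_rate / watts


rate = pixels_per_second(UHD_WIDTH, UHD_HEIGHT, fps=60, streams=2)
efficiency = pixels_per_joule(rate, watts=10)
print(f"{rate / 1e6:.1f} Mpixels/s total, {efficiency / 1e6:.1f} Mpixels/s per watt")
```

Under those assumptions, the target works out to just under one billion pixels per second, or about 100 million pixels per second per watt.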
For AI inference, Facebook wants chips capable of at least 5 TOPS/W. It is working with Esperanto, Habana, Intel, Marvell, and Qualcomm on its open-source Glow compiler for inference.
Facebook’s Emerald Pool is a mechanical and electrical design for a server packing up to eight accelerators, currently on a PCIe Gen 3 bus.
Arm still wrestles with data center servers
Microsoft is adding AMD Naples servers to the x86 lineup in its data centers, but so far, it has not been able to get Arm servers into production. The last wrinkle is smoothing out the many dependencies in the complex cloud software stack, but Vaid is hopeful that the work could be done in less than a year.
Marvell’s ThunderX2 is Microsoft’s only current candidate chip after Qualcomm pulled the plug on its Centriq. However, Microsoft is expected to test the new Ampere chip once it becomes available.
Huawei showed the dual-socket Arm server that it announced in January and is now sampling with 64 custom cores per socket.
Microsoft aims to streamline SSD controllers
In storage, Facebook and Microsoft are testing Intel’s Optane but staying mum about the results so far. Microsoft showed a 256-TByte 1U flash array consuming 400 W that it aims to put in production next month using 32 of the 16-TByte 3D NAND cards that Intel defined in a so-called ruler form factor.
Microsoft’s Vaid showed a Project Denali board that pushes most firmware jobs to the server, shrinking SSD controllers to simpler chips that only manage the NAND media, saving money and simplifying management.
In networking, Mediatek’s Nephos division showed 10 design wins for its 6.4-Tbits/s switch chip, some now running in Chinese and U.S. data centers. It has taped out a 12.8-Tbits/s multi-die device using TSMC’s 7-nm process and its InFO packaging.
Rival startup Innovium said that it is in production with its 12.8T chip, which has design wins in two Cisco switches ramping this year. The leader in the field, Broadcom, is shipping its 12.8T Tomahawk-3 but is not believed to have taped out a 7-nm chip yet.
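To put the 6.4- and 12.8-Tbits/s figures in perspective, the sketch below divides aggregate switch bandwidth by a few common Ethernet speeds. The 25-, 100-, and 400-Gbit/s values are illustrative assumptions, and the calculation ignores how any particular chip actually arranges its SerDes lanes.

```python
def port_count(switch_gbps: int, port_gbps: int) -> int:
    """Number of ports of a given speed that fit in the aggregate bandwidth."""
    if switch_gbps % port_gbps:
        raise ValueError("port speed does not divide the switch bandwidth evenly")
    return switch_gbps // port_gbps


# Aggregate bandwidths from the article, expressed in Gbit/s.
for switch_gbps in (6_400, 12_800):
    for port_gbps in (25, 100, 400):  # assumed port speeds, for illustration
        ports = port_count(switch_gbps, port_gbps)
        print(f"{switch_gbps / 1000:.1f} Tbit/s -> {ports} x {port_gbps}G")
```

On this simple division, a 12.8-Tbit/s device can serve 128 ports at 100 Gbit/s or 32 ports at 400 Gbit/s, twice what a 6.4-Tbit/s chip offers at each speed.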
Meanwhile, Nokia is leading an OCP project for a standard enclosure for telco edge networks. Board and mechanical designs have been made open-source for the Open Edge effort that aims to serve deployments across a wide range of conditions.
Facebook showed its latest switch design, Minipack, sporting the Tomahawk-3 chip and gearbox devices from Broadcom serving a wealth of 25G optical ports. It also announced a new data center topology collapsing its four layers to three to save cost and reduce hops.