PARIS - Wave Computing has set its sights on becoming the first AI startup to develop and deploy a 7-nm AI processor in its AI systems.

EE Times has learned that Wave has snagged Broadcom Inc. as an ASIC designer for the new 7-nm project. The two companies will collaborate on developing Wave’s next-generation Dataflow Processing Unit (DPU) using Taiwan Semiconductor Manufacturing Co.’s 7-nm process node.

The new 7-nm DPU — scheduled for delivery by Broadcom at an undisclosed date — will be “designed into our own AI system,” confirmed Wave’s CEO, Derek Meyer. He added that the same chip may become available to others “if there is a market demand.”

“Wave is hoping to get a jump on the startup competition with a 7-nm part,” observed Kevin Krewell, principal analyst at Tirias Research. “Most startups don’t have the expertise to build a 7-nm part just yet.” He explained that Broadcom’s involvement made this possible. Broadcom, he noted, “does have more senior ASIC circuit design experience through the acquisition of LSI Logic.”

Wave’s current-generation DPU is based on a 16-nm process design.

“Among our peers who are designing a new breed of AI accelerators, we will be the first to have access to 7-nm physical IP — such as 56-Gbps and 112-Gbps SerDes — thanks to Broadcom,” noted Meyer. Broadcom is “instrumental to bringing this 7-nm project to fruition,” he explained, thanks to “their industry-leading design platform, productization skills, and proven 7-nm IPs.”

Wave’s current-generation DPU, based on a 16-nm process node, was designed by Wave’s employees with the help of contractors. As for the 7-nm DPU, Meyer said, “Between Broadcom and Wave, we have sketched out skills and resources that will be necessary to both front-end and back-end [of the ASIC] designs. We devised our plans for collaboration accordingly.”

The joint 7-nm project has been up and running for several months. Broadcom will manage physical delivery of the 7-nm chip. Despite the complexity of 7-nm designs, Meyer said, “I am confident that Broadcom will deliver the first-time right chip.” Wave, however, declined to comment on when its 7-nm DPU will become available.

What’s in the 7-nm DPU?
Wave did not reveal the architecture of its 7-nm DPU, either.

Meyer, however, explained that the new chip will be “based on the data flow architecture.” It will be the first DPU featuring a “64-bit MIPS multithreaded CPU.” Wave acquired MIPS in June.

Meyer also indicated that Wave’s 7-nm chip will come with “new features in memory,” but he refrained from disclosing what exactly those features are.

MIPS’s multithreading technology will play a key role in the new-generation DPU, according to Meyer. In Wave’s dataflow processing, “when we load, unload, and reload data for machine-learning agents, hardware multithreading architecture is effective.” MIPS’s cache coherence is another positive for Wave’s new DPU. “Because our DPU is 64-bit, it only makes sense that both MIPS and DPU talk to the same memory in 64-bit address space,” he said.

Asked about Wave’s new features in memory, Krewell said, “Wave’s present chip uses Micron’s Hybrid Memory Cube. And I believe Wave will move to high-bandwidth memory (HBM) in future chips.” He added, “There’s a much better roadmap for HBM. The changing memory architecture will have an impact on the overall system architecture.”

Karl Freund, senior analyst at Moor Insights & Strategy, concurred. He said, “For memory, I suspect they will abandon the Hybrid Memory Cube and adopt high-bandwidth memory, which is more cost-effective.”

During the interview, Meyer boasted that the new 7-nm DPU should be able to offer 10 times the performance of the company’s current chip.

“Remember, we separated the clocks from our chips” in the DPU architecture, he said. Noting that going back and forth to a host creates a bottleneck, he explained that in the DPU, an embedded microcontroller loads instructions, cutting down on the power and latency wasted by traditional accelerators. “We can take advantage of that capacity available for transistors on the 7-nm chip to increase the performance.”

Krewell remained a little skeptical. “As to whether Wave can make a 10x leap, that’s a long reach.” He said, “It depends on how machine-learning performance is measured … and whether Derek [Meyer] was talking training or inference.” He added, “There are a lot of changes going on in inference, with lower-precision (8-bit and below) algorithms being deployed. Training performance is heavily memory-architecture-dependent.”

He acknowledged, “But I don’t know the details of what Wave has planned.”
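
For readers unfamiliar with the lower-precision inference Krewell mentions, the Python sketch below shows one common approach: symmetric linear quantization of floating-point weights to signed 8-bit integers. It is a generic illustration only, not a description of Wave’s DPU, whose design has not been disclosed. The function names and the sample weights are made up for this example.

```python
def quantize_int8(values):
    """Map floats to signed 8-bit integers using one shared scale factor.

    Hypothetical helper for illustration; not taken from any Wave software.
    """
    max_abs = max(abs(v) for v in values)
    # One integer step represents `scale` in floating-point units.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest step and clamp to the symmetric int8 range.
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the 8-bit integers."""
    return [q * scale for q in quantized]


# Made-up sample weights, chosen only to keep the arithmetic easy to follow.
weights = [0.02, -1.27, 0.5, 0.75]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is at most half of one quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is stored in one byte instead of four, at the cost of a small rounding error. That trade-off is one reason a performance figure can differ sharply depending on whether it is measured on inference or on training.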

Junko Yoshida, Chief International Correspondent, EE Times