Intel Ties CPUs, FPGAs, and Memories into a Package Deal

By Rick Merritt

Customers express concerns that some advances in new Xeon CPUs, Agilex FPGAs, and Optane DIMMs come at the cost of Intel proprietary lock-ins.

HILLSBORO, Ore. — Intel announced a basketful of Xeon processors, Agilex FPGAs, and Optane DIMMs to power next-generation servers and network gear. The components’ use of proprietary interconnects is raising concerns among some of its largest customers given that Intel’s CPUs dominate in servers.

Strategically, Intel has tied its CPUs, FPGAs, and memories into a package deal, claiming performance benefits. It’s part of an industry trend to harness several chips into a dogsled that drives systems performance forward given the slowing rate of performance gains in individual chips.

Intel will supply a chiplet implementing a cache-coherent processor bus to link its Agilex FPGAs to its Xeon CPUs. Both the Xeon and Agilex FPGAs link to Intel’s Optane memories via its proprietary DDR-T protocol. Such vendor lock-ins “generally haven’t succeeded in the past with Intel or other companies,” said Linley Gwennap of the Linley Group.

Last month, Intel announced plans for CXL, an open processor bus that it will implement starting in 2021. An Optane executive here said that the group also “will have to make a hard decision at some point about opening up” the DDR-T interface.

“There will be lots of new memories with new price and performance points — that’s great, but no one wants to be locked into one protocol,” said an engineering manager from a top data center operator.

“Customers want to see openness,” said the engineer, who asked not to be named. “All companies think they have an edge keeping something proprietary, but they are hurting themselves by not driving things faster.”

Three interfaces, now? Four?

“The industry is confused about interfaces such as CCIX, Gen-Z, CXL, and I think there is a fourth,” he added. “Which one do I bet on? Will my software work carry forward? That confusion is holding people back. So even if there are performance gains, the industry is not doing itself a favor by fragmenting standards.”

The proprietary interfaces are a “high concern,” said a lead engineer at another large data center operator who asked not to be named.

“Historically, we were in a lock-down situation that resulted in very poor generation-over-generation improvements,” the second engineer said. “Enabling such a lock-down will just make the situation worse as now you lock memory with the CPU and the accelerators. I think this is counter-intuitive … and will not be successful.”

“Optane DIMMs have value for some applications that need very large memory arrays at a more reasonable price and are willing to compromise on performance,” he added. “But again, there is the issue of Intel being the only source for that technology, which is problematic any way you look at it.”

Users “don’t want to go from depending on Intel for processors to depending on them for memory, too,” said Jim Handy, a memory market watcher at Objective Analysis.

In addition, new interconnects, especially cache-coherent ones, can require rewriting and tuning software, a costly process. They can also bring greater power consumption, larger packet sizes, and higher costs. Customers will likely test the latest options only if there are demos of apps showing potential for at least 2× performance gains, the first engineer said.

Cascade Lake
Intel rolled out more than 40 versions of Cascade Lake, with the half in the higher tiers enabled for its Optane memories. (Source: Intel)

A 27-W SoC to a 400-W liquid-cooled monster

Intel detailed more than 40 versions of its 14-nm-plus second-generation scalable Xeon, aka Cascade Lake, offering a 33% average performance boost. It also announced Agilex, a family of 10-nm FPGAs using its expanding chiplet technique to forge links to Xeon CPUs, 112G SerDes, and blocks that customers can design using the flow it acquired with eASIC.

In addition, Intel provided the first design and performance details of its Optane DIMMs and their proprietary DDR-T protocol. The disclosures provided a glimpse into the difficulty of creating a new memory tier, a cautionary tale for rivals, such as Micron and Samsung, likely to follow a similar path in the future.

Tactically, the most significant piece of the Xeon upgrade was the addition of four cores to many mainstream CPUs without increasing their prices, said analyst Gwennap.

“Intel used this release to adjust its pricing structure responding to competition from AMD and Arm … basically, at the same price as the previous generation, users are getting four cores for free; that’s a huge improvement,” he said.

In an effort to push its high-end Xeon forward, Intel put two dies in a package for a total of up to 56 cores and as many as four of the resulting chips in a new reference design. Driving the boards to the hilt, it designed some to run at up to 400 W with liquid cooling, a technique coming back in fashion as CMOS scaling slows.

“To me, it’s a stunt that doesn’t change the equation in performance per watt or per dollar … it offers twice the performance at twice the cost and power, so it’s not clear what the point of that is,” said Gwennap.
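Gwennap's arithmetic can be checked with a minimal sketch. The figures below are normalized placeholders (a baseline of 1.0 and a part that doubles performance, power, and cost), not Intel's published numbers, and the `Part` class is a hypothetical helper introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Part:
    """A processor described in units relative to a baseline (assumed values)."""
    perf: float   # relative performance
    watts: float  # relative power draw
    cost: float   # relative price

    def perf_per_watt(self) -> float:
        return self.perf / self.watts

    def perf_per_dollar(self) -> float:
        return self.perf / self.cost


# Normalized placeholders: the baseline is 1.0 on every axis, and the
# dual-die part doubles all three, as in the analyst's characterization.
baseline = Part(perf=1.0, watts=1.0, cost=1.0)
doubled = Part(perf=2.0, watts=2.0, cost=2.0)

for name, part in (("baseline", baseline), ("doubled", doubled)):
    print(name, part.perf_per_watt(), part.perf_per_dollar())
```

Under these assumptions, both ratios are identical for the two parts, which is the sense in which doubling everything leaves the efficiency equation unchanged.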

The design was a response to “not having a lot of levers to pull” in the latest Xeon generation, he added. That’s because Cascade Lake has no new microarchitecture, only a tweak of the 14-nm-plus process that the prior Skylake used, and the new CPUs plug into the same overall server platform as Skylake.

The 9200-series chips beat AMD’s last-generation Epyc CPU handily. However, they will compete later this year with the 7-nm Rome version of Epyc that will pack up to 64 cores.

Intel took longer than expected to field Cascade Lake, leading Gwennap to speculate that its first 10-nm Xeon — Ice Lake — may not debut until late 2020. Intel is expected to field a system upgrade called Cooper Lake, perhaps before the end of the year, that could help it catch up with the extra memory and I/O channels that AMD offers with Epyc.

The good news is that Intel is covering the waterfront with versions of Cascade Lake optimized for specific workloads such as networking and virtualization. They start at 70-W chips with eight cores running at a base 2.2-GHz frequency, some supporting extended temperature ranges.

The chips are also the first to harden defenses against side-channel attacks such as Spectre and Meltdown, generally without performance degradation.

To drive the x86 deeper into networking, Intel took a step backwards with its new Xeon-D 1600, shipping by June. The 27- to 65-W SoC uses two to eight older Broadwell cores to increase performance per watt combined with four 10-Gbit Ethernet controllers to power base stations, routers, and switches.

Xeon chips
To keep a performance lead, Intel packed up to four devices with two high-end Xeon chips each in reference designs that it runs at up to 400 W with liquid cooling. (Source: Intel)

Intel FPGA chiplets vie with Xilinx Versal blocks

Agilex was one of the most interesting announcements of the batch. The 10-nm FPGA is the first designed entirely since Intel acquired the former Altera.

Intel will supply chiplets for 112G SerDes, PCIe Gen5, and a cache-coherent processor bus that attaches to Agilex using Intel’s embedded multi-die interconnect bus (EMIB). It acquired eASIC last July so that customers could use its flow to harden their own EMIB blocks.

The industry is showing renewed interest in chiplets as a way to lower the cost of making high-end chips as Moore’s Law slows. Last week, an initiative to create an open standard for chiplets held its first workshop.

Last summer, Intel released the AIB protocol for its EMIB package as open source as part of its work in a DARPA research program on chiplets. However, the company expects to build its chiplet ecosystem more from the work of its customers than third parties.

“There’s a lot of interest from third parties and we have some engagements, but the model is more at the customer level,” said Dan McNamara, general manager of Intel’s programmable logic division. “They like the eASIC flow, but it’s still early in the development of the ecosystem.”

“We’re not averse to adopting anything else, but with our work in the DARPA program and AIB, we feel we’re ahead of the curve,” added McNamara, who spearheaded the idea of Intel’s eASIC acquisition.

Agilex will also sport links to DDR5 memory, PCIe Gen5, and Intel’s proprietary Optane DIMMs. Microsoft’s cloud service already uses Intel’s FPGAs flexibly as accelerators for a variety of jobs including compression, encryption, and search. It’s unclear what other use cases Intel will find for coherently linked pools of Xeon, Agilex, and Optane chips.

“We’re doing a bunch of things with the [Intel] memory group that we’re very excited about, but we’re not talking about them yet,” McNamara said.

Customers can use tools from the former eASIC to create chiplets that link to Intel’s next-gen FPGAs via Intel’s proprietary EMIB interconnect. (Source: Intel)

Intel claims that Agilex will deliver 40% more performance or consume 40% less power than its prior Stratix FPGAs. Interestingly, the new chips are expected to sample in the fall — about the same time as Intel’s first 10-nm PC processors, with volumes presumably in early 2020.

Initially, the chips will be programmed with the group’s traditional Quartus tools. But later this year, Intel aims to roll out plans for OneAPI, a high-level abstraction layer and set of libraries that will serve its FPGAs, CPUs, and upcoming GPU.

Agilex comes in the wake of Versal, a hybrid FPGA/SoC sold as chips or boards, that FPGA rival Xilinx rolled out in October.

“Xilinx is taking a very different strategy from Intel, supporting a wide variety of hard logic blocks on die … Xilinx is pushing Versal as a processor with some programmable gates, while Intel is taking a more traditional approach of an FPGA linked to its processor,” said Gwennap.
