What was on Display at Nvidia’s GTC Event?

By Rick Merritt

Nvidia’s annual graphics event attracted some 8,000 attendees, but one expected guest couldn’t make it — a 7nm GPU.

A nearly three-hour keynote featured new systems and software for the company’s latest processors, announced last August. Ironically, the most interesting news nuggets were Nvidia’s cheapest board to date and a research project on optical interconnects.

“The length of the event was inversely proportional to the content,” quipped one analyst.

The unspoken message for a pack of rivals aiming to build deep-learning accelerators was clear: Nvidia doesn't need to pre-announce a new, faster chip because it owns the software stack and the channel today.

Indeed, one data center manager said only one rival is sampling a working chip for AI training: the startup unicorn Graphcore. But Graphcore still faces significant work adapting its chip to a software stack that has been running on Nvidia GPUs for several years.

Accentuating the point, Nvidia packaged its many libraries under one new umbrella, CUDA-X, with versions for graphics, AI, and more. It also described new use cases for its chips: a cross-tool environment for offline rendering called Omniverse and an expansion of its GeForce Now online gaming service.
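CUDA-X AI folds in GPU-accelerated libraries such as RAPIDS for data analytics. As a rough illustration of the workflow those libraries target, here is a minimal sketch, assuming a CUDA-capable GPU with the RAPIDS cuDF package installed; the file and column names are hypothetical:

```python
# Minimal sketch of GPU-accelerated data-frame work of the kind CUDA-X AI's
# RAPIDS libraries target. Assumes a CUDA-capable GPU and the cudf package;
# the CSV file and column names below are hypothetical.
import cudf

# Load the data straight into GPU memory.
df = cudf.read_csv("rides.csv")

# Filter and aggregate on the GPU using a pandas-like API.
shared = df[df["passenger_count"] > 1]
avg_fare = shared.groupby("pickup_zone")["fare_amount"].mean()

print(avg_fare.head())
```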

It packaged 40 of its latest graphics chips into an 8U RTX graphics server. For demanding data centers, it ganged 32 of the servers into a pod spanning 10 racks, with 1,280 GPUs linked via InfiniBand.

“Data center graphics needs a whole new architecture,” said chief executive Jensen Huang, noting the company is working up more use cases for them.

Data Science Workstation
A GPU desktop configured for a data scientist.

To bolster its claim on the hearts and minds of AI developers, it configured a workstation specifically for data scientists. The system uses two Quadro RTX 8000 GPUs with a combined 96-GByte frame buffer and comes with deep-learning software preinstalled.

“Data science is the new challenge in high-performance computing,” said Huang, adding that last year Nvidia trained 100,000 of the 3 million data scientists working today.

Meanwhile, it enlisted top server makers, including Cisco, Dell, HPE, Inspur, Lenovo, and Sugon, to build and sell T4 servers. The systems are scaled-back versions of Nvidia's DGX-2, aimed at mainstream business users kicking the tires of deep learning and data analytics. They use up to four Turing-class T4 GPUs and 64 GBytes of GDDR6 memory.

In the cloud, Amazon joined Google and Baidu in announcing plans for a service based on Turing chips. Alibaba is expected to follow suit.

The network is the computer, and the end of copper

The Turing family, extended in January with midrange products, now starts as low as $219 for versions with shaders but not ray-tracing support or deep-learning cores.

Taking a big step down-market, Huang announced Jetson Nano, a $99 version of its robotics board for makers and academics. It consumes as little as 5 W but still sports 4 GBytes of memory and runs Linux.
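The Nano runs the same CUDA stack as Nvidia's larger Jetson boards, so a maker could, for example, verify the on-board GPU and run a small network on it from Python. A minimal sketch, assuming the JetPack Linux image plus a CUDA-enabled PyTorch and torchvision build (the model choice is purely illustrative):

```python
# Minimal sketch of checking the Jetson Nano's GPU and running a small
# inference on it. Assumes JetPack Linux with CUDA-enabled PyTorch/torchvision
# installed; MobileNetV2 is an arbitrary, Nano-friendly example model.
import torch
import torchvision.models as models

assert torch.cuda.is_available(), "No CUDA GPU visible"
device = torch.device("cuda")
print("Running on:", torch.cuda.get_device_name(0))

# A small classifier that fits comfortably in 4 GBytes of memory.
model = models.mobilenet_v2(pretrained=True).to(device).eval()

# Dummy 224x224 RGB frame standing in for a camera image.
frame = torch.rand(1, 3, 224, 224, device=device)
with torch.no_grad():
    scores = model(frame)

print("Top class index:", int(scores.argmax()))
```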

Long term, “the future datacenter will change as networking and compute become one fabric — and that’s why we are buying Mellanox,” said Huang of the deal, which is not expected to close until the end of the year.

“The program-centric data center is becoming the data-centric data center where data will create programs,” said Eyal Waldman, the chief executive of Mellanox, invited to briefly share the stage with Huang.

The marriage makes sense given that startup rivals such as Wave Computing and Cerebras are already building products that span multiple boxes linked by proprietary interconnects. As Moore’s Law slows, a single chip — even a single system — can’t deliver the performance that expanding workloads require.

Photonic DGX
More automated manufacturing and better power efficiency of optical components may spell the end of board traces for high-end processors.

Presaging that future, Nvidia’s chief scientist, Bill Dally, told reporters about a research project on optical chip-to-chip links. It targets throughput measured in terabits per second while drawing 2 picojoules per bit. In an initial implementation, 32 wavelengths will run at 12.5 Gbits/s each, with a move to 64 wavelengths doubling bandwidth in a follow-up generation.
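A quick back-of-envelope check of those figures (the arithmetic here is mine, not Dally's) shows where the first generations land and what the energy target implies per link:

\[
32 \times 12.5\ \text{Gbit/s} = 400\ \text{Gbit/s},\qquad
64 \times 12.5\ \text{Gbit/s} = 800\ \text{Gbit/s}
\]
\[
400 \times 10^{9}\ \text{bit/s} \times 2\ \text{pJ/bit} = 0.8\ \text{W per 400-Gbit/s link}
\]

Reaching multi-terabit aggregates would thus take several such links in parallel or faster per-wavelength lanes.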

Dally predicted copper links will start to run out of gas as data rates approach 100 Gbits/s, already on many roadmaps for network switches. Progress in more power-efficient laser sources and ring resonators will enable the long-predicted shift, he said.

If the future evolves as he believes, bleeding-edge GPUs may continue to skip an appearance at some Nvidia events. Attendees will have to hope that as the interconnects speed up, the keynotes don’t get even longer.
