Nvidia DPU Eyes NICs in Next-Gen Data Center

By Sally Ward-Foxton

At its GTC Conference this week, Nvidia CEO Jensen Huang will unveil everything from its lowest-priced dev kit to a new supercomputer...

Nvidia announced a new type of processor, the data processing unit (DPU): essentially a network interface card (NIC) with built-in Arm CPU cores to offload and accelerate networking, storage and security tasks that previously ran on the host CPU. The DPU will eventually replace the NIC in data center systems.

Nvidia’s GTC Conference always brings a flurry of futuristic announcements from the AI behemoth. This autumn’s event, which will host 30,000 attendees virtually, is no exception, with CEO Jensen Huang expected to announce everything from a new type of data center processor to its lowest-priced AI development kit yet, and from video conferencing AI that makes it look like you’re paying attention to a brand-new supercomputer for the healthcare industry.

Data center DPU
In a press pre-briefing ahead of Jensen Huang’s keynote, Manuvir Das, head of enterprise computing at Nvidia, said that data centers have been experiencing a shift over the last decade, whereby a number of functions that were performed in hardware, such as network security, have become software-defined and have moved within the application server itself, typically inside the hypervisor.

“As NICs have advanced, you will also find acceleration engines for different kinds of I/O activities, such as RDMA embedded there as well,” he said. “As this has happened, it’s created a greater load on the x86 CPU hosts of the server, leaving less room to run the applications.”

The Bluefield-2 DPU comes on a PCIe card, and combines a Mellanox ConnectX-6 Dx SmartNIC with 8x 64-bit Arm A72 cores and 2x VLIW acceleration engines on the same silicon. A single Bluefield-2 DPU can deliver the same data center services that could consume up to 125 CPU cores. This frees up valuable CPU cores to run a wide range of other enterprise applications, said Das.

Nvidia’s Bluefield-2X card, which includes the DPU and Ampere accelerator GPU (Image: Nvidia)

Nvidia also announced the Bluefield-2X, which adds an Ampere family AI accelerator GPU to the same card as the Bluefield-2. This adds 60 TOPS of AI acceleration that can be used for intelligent analysis of what is going on in the network. For example, it could be used for intrusion detection, where AI can tell the difference between normal and abnormal behavior so that anything abnormal can be proactively identified and blocked.

Huang’s keynote also presented the roadmap for the Bluefield family. Bluefield-3 will have a more powerful Arm CPU and more powerful acceleration engines. The 2X and 3X versions pair the DPU with Ampere GPUs on the same card, offering 60 and 75 TOPS of AI acceleration, respectively. Bluefield-4, expected circa 2023, will offer a whopping 400 TOPS of GPU acceleration integrated into the same silicon as the DPU.

While Arm processors feature heavily in the DPU, development of this product started several years ago at Mellanox, and any link with the recent news that Nvidia is to acquire Arm is entirely coincidental, said Das.

“It’s not just about moving the workload, it’s about accelerating it,” he said. “That’s a fundamental tenet of our DPU. We use Arm cores because we believe that in this context, they are the best CPUs to have on the device.”

Edge AI gets edgier
For the world of edge AI, Nvidia announced a new entry-level developer kit, the Jetson Nano 2GB. This kit will retail at $59 and is designed for teaching and learning AI. It will be available at the end of this month.

The new Jetson Nano dev kit, which will retail at $59 (Image: Nvidia)

While the first four or five years of the Jetson program focused on professionals and enthusiasts, the launch of the Jetson Nano 18 months ago democratized AI, bringing it to a much wider audience, said Deepu Talla, vice president and general manager of edge computing at Nvidia.

“In the first five years of the Jetson journey, we started from zero and went up to 200,000 developers,” he said. “In the last 18 months, we’ve added another 500,000 developers, and we also observed that the activity of these developers has increased 10 times… I think we are at a seminal point right now where every software engineer in the world wants to reskill themselves to become an AI engineer. Every university student in engineering, whether it’s electrical engineering or computer science, wants to jump in and re-skill themselves with AI. And we’re also starting to see high school students, especially in the STEM area, wanting to learn AI.”

For students and AI beginners, Nvidia will provide a free training and certification program alongside the Jetson community of projects, how-tos and videos contributed by developers. At the end of the course, candidates must submit an AI project which is graded: a passing grade earns certification from the Nvidia Deep Learning Institute.

Futuristic communication
Nvidia used GTC to announce an open beta for its previously announced Omniverse platform, which Huang called “the beginning of the Star Trek Holodeck, realized at last.”

Omniverse is a 3D simulation and collaboration platform that allows remote teams to collaborate on projects that rely on 3D graphics, including architecture, engineering, animation and more. The open beta follows a year-long early access program in which 40 companies provided feedback to Nvidia.

Omniverse uses Pixar’s widely adopted Universal Scene Description (USD) format for interchange between 3D applications, plus Nvidia’s real-time photorealistic rendering, physics and materials technologies, enabling interactive workflows between industry-leading 3D software products.

Nvidia also announced a new suite of cloud-based AI technologies for video call providers. Called Nvidia Maxine, the suite uses AI to correct the gaze of participants so it looks like they are always looking into the camera, creates super-resolution video from lower quality streams, artificially adjusts lighting, and more.

“A lot of us as we’re on these calls all day have multiple windows open,” said Richard Kerris, general manager of media and entertainment at Nvidia. “We’re looking at different things, and not really making the eye contact that you want to make with the person that you’re talking to. Using AI we can actually reconstruct that face and ensure that the eye contact is taking place so that you have a more personalized experience.”

Maxine can also reduce the bandwidth needed for video calls. Instead of simply streaming the whole video, Maxine’s AI can determine facial movements of the person speaking, and send only that data over the internet. It then reanimates the person’s face from the data at the other end. This tech can reduce video bandwidth to a tenth of what’s required for H.264 streaming today.
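Back-of-envelope arithmetic shows why sending only facial data is so much cheaper than sending pixels. The sketch below is purely illustrative of the idea, not Nvidia’s actual protocol; the frame size, the landmark count and the float32 encoding are all assumptions chosen for the example.

```python
# Illustrative sketch of keypoint-based video compression: instead of
# streaming raw pixels, send only facial landmark coordinates per frame.
# The numbers here are assumptions, not Maxine's actual parameters.
import struct

FRAME_W, FRAME_H = 1280, 720             # a 720p frame
RAW_FRAME_BYTES = FRAME_W * FRAME_H * 3  # 24-bit RGB, uncompressed

NUM_LANDMARKS = 130                      # hypothetical facial keypoint count

def encode_landmarks(landmarks):
    """Pack (x, y) facial keypoints as little-endian float32 pairs --
    this payload is the only per-frame data sent over the network."""
    return b"".join(struct.pack("<ff", x, y) for x, y in landmarks)

# Stand-in for a face-landmark detector's output for one frame.
landmarks = [(0.5, 0.5)] * NUM_LANDMARKS
payload = encode_landmarks(landmarks)

print(len(payload))                      # 130 keypoints * 8 bytes = 1040 bytes
print(RAW_FRAME_BYTES // len(payload))   # saving per frame vs. raw video
```

Against compressed H.264 rather than raw frames the saving is of course far smaller, which is consistent with the order-of-magnitude reduction the article cites.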

Nvidia’s Maxine platform for video call providers has a range of features, from adjusting a person’s gaze so they appear to be paying attention, to reducing the bandwidth required by a factor of 10 (Image: Nvidia)

Other features for video call providers include use of Jarvis, Nvidia’s conversational AI, to translate between languages in real-time, so you can video conference with a person speaking another language.

Early access to Maxine starts today and Kerris said Nvidia already has video call providers excited to roll this technology out in the next few months.

Healthcare innovation
Nvidia also had several important announcements in the field of healthcare. The company launched Clara Discovery, a set of tools which includes pre-trained AI models and application-specific frameworks for computational drug discovery.

“Drug discovery is the grand challenge of our time, and there’s no more important time for us to help improve the process; it’s still very complex,” said Kimberly Powell, vice president and general manager for healthcare at Nvidia.

Analysis of genomic sequencing data, understanding protein structure, selecting chemical compounds to investigate and simulating how they behave are compute-intensive parts of the drug discovery process. Clara Discovery uses imaging, radiology and genomics data to develop AI applications to accelerate this process. Clara also includes Bio-Megatron, a new language model developed specifically for biomedical texts and papers. Bio-Megatron will be used for research purposes such as literature searches, and to interpret unstructured clinical notes from doctors.

Federated learning, employed in a new project by Nvidia and Massachusetts General Brigham Hospital, allows training of a medical AI model for Covid-19 treatment while ensuring patient data does not leave the premises (Image: Nvidia)

Nvidia has partnered with Massachusetts General Brigham Hospital to develop an AI model that determines whether a person showing up in the emergency room with Covid-19 symptoms will need supplemental oxygen hours or even days after an initial examination. The model, Corisk, combines medical imaging with other patient health data to predict what oxygen level the patient will need.

Using a federated learning technique developed with King’s College London 18 months ago, called privacy-preserving federated learning, data from 20 different hospitals around the world was used to train the Corisk model. The technique shares a global model with the hospitals, which train it locally on their own data. They then share back partial model weights, which update the global model. Patient data does not leave the hospital. Importantly, this means the model can be trained on data from diverse patient groups, captured on different types of imaging equipment, without compromising privacy. The resulting model achieved 0.94 area under the curve (AUC) in only two weeks and will be deployed as part of Clara in the coming weeks.
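The training loop described above can be sketched in a few lines. This is a toy federated-averaging round on a one-parameter model, purely to illustrate the data flow — hospitals send back only weight deltas, never raw records; the model, learning rate and data are all invented for the example and bear no relation to Corisk itself.

```python
# Toy sketch of a federated-averaging round: each 'hospital' trains the
# shared model locally and returns only a weight delta; the server averages
# the deltas into the global model. Raw data never leaves each site.

def local_update(global_w, local_data, lr=0.01, steps=5):
    """Fit y ~ w*x on local data by gradient descent; return the weight delta."""
    w = global_w
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in local_data) / len(local_data)
        w -= lr * grad
    return w - global_w          # only this update is shared, never the data

def federated_round(global_w, hospitals):
    """Server step: average the hospitals' deltas into the global model."""
    deltas = [local_update(global_w, data) for data in hospitals]
    return global_w + sum(deltas) / len(deltas)

# Three 'hospitals' whose local samples all follow y = 3x.
hospitals = [[(1, 3), (2, 6)], [(3, 9)], [(4, 12), (5, 15)]]
w = 0.0
for _ in range(20):
    w = federated_round(w, hospitals)
print(round(w, 2))               # converges toward 3.0
```

Real systems like the one behind Corisk add secure aggregation and partial weight sharing on top of this basic loop, but the privacy property is the same: only model updates cross the hospital boundary.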

Nvidia is teaming up with pharmaceutical giant GSK in the UK, where GSK has built the industry’s first AI lab dedicated to drug and vaccine discovery. Both GSK and Nvidia data scientists will work in the London lab, which has already invested in a number of Nvidia DGX-A100 systems.

Nvidia announced it will invest £40m (about $51m) in a brand-new AI supercomputer in the UK. Cambridge-1 will be the UK’s most powerful supercomputer, delivering 400 petaflops of AI performance, or 8 petaflops on the Linpack benchmark. Based on eighty DGX-A100 systems, it will rank in the top 30 of the TOP500 supercomputer list, and in the top three of the Green500 list. It is scheduled to be up and running by the end of 2020.

Cambridge-1 will use eighty Nvidia DGX-A100 systems for a total of 400 PFLOPs of AI compute (Image: Nvidia)

This is separate from the supercomputer Nvidia announced it is building at Arm’s Cambridge headquarters as part of the Arm acquisition; Cambridge-1 will be based on the DGX SuperPOD architecture, whose CPUs are in turn based on the x86 architecture.

Cambridge-1 will be used by the industry to solve large-scale healthcare and data science problems, as well as by universities and startups. Nvidia plans for Cambridge-1 to be part of the AI center of excellence it is building in Cambridge, which will eventually expand to include more supercomputers and more industries across the UK.
