Nvidia is leveraging its open-source HGX GPU accelerator to drive AI cloud computing.
The big change from the PC era to the era of accelerated computing is the emergence of open source.
For Intel to dominate the PC market, Microsoft needed to be there. But as Paul Teich pointed out, “Nvidia doesn't need a software partner in an open-source world. Open source had not been invented yet in the early 1980s, when IBM designed the PC, at least not in a modern sense.”
He added, “Nvidia already had a general purpose GPU programming environment for the high performance computing and graphics/movie rendering audiences, so adding the additional features for deep learning wasn't that big a deal.”
But here’s the thing. “It just required foresight, and Nvidia got in early. But with Google, AWS (Amazon Web Services), Microsoft, Baidu and the lot throwing new deep learning model frameworks over the wall at a quarterly cadence, Nvidia's job is to provide non-denominational developer tools that support all of the frameworks,” Teich explained.
Practically every market research firm and consultancy is bullish on the fast-paced growth of the AI market and its broad acceptance by many businesses.
Accenture, which describes AI as the new UI, claims, “Despite skepticism of AI as just another technology buzzword, its momentum is very real.” The management consulting firm stated in its recent report, “85 per cent of executives we surveyed report they will invest extensively in AI-related technologies over the next three years.”
Other market research firms’ predictions all suggest something similar. Forrester Research predicted a greater than 300 per cent increase in investment in artificial intelligence in 2017 compared with 2016. IDC estimated that the AI market will grow from $8 billion in 2016 to more than $47 billion in 2020.
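To put the IDC estimate in annual terms, the short calculation below works out the compound annual growth rate it implies. This is a rough sketch: the function name `cagr` is ours, not IDC's, and because IDC said "more than $47 billion", the result should be read as a lower bound.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values, as a fraction."""
    return (end_value / start_value) ** (1 / years) - 1


# IDC figures quoted above: $8 billion in 2016, more than $47 billion in 2020.
# Using exactly $47 billion gives a floor for the implied growth rate.
idc_growth = cagr(8e9, 47e9, 2020 - 2016)
print(f"Implied annual growth: at least {idc_growth:.1%}")
```

On those figures, the market would need to grow by roughly 56 per cent a year over the four years.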
But the number that hits closest to home for Nvidia is one of Gartner’s predictions. The market research firm said, “By 2018, deep learning (deep neural networks) will be a standard component in 80 per cent of data scientists’ tool boxes.” If true, Nvidia will be ready to plug its standard GPU accelerator, HGX-1, into every available socket.
HGX reference design
Nvidia described its HGX reference design as meeting “the high-performance, efficiency and massive scaling requirements unique to hyperscale cloud environments.”
But what’s under the hood of the HGX reference design?
Figure 2: Inside Nvidia's HGX reference design. (Source: Nvidia)
Nvidia explained that because HGX is designed to be highly configurable to meet workload needs, it can easily and flexibly combine GPUs and CPUs for high performance computing, deep learning training and deep learning inferencing.
The standard HGX design architecture includes eight Nvidia Tesla GPUs in the SXM2 form factor, connected in a cube mesh using Nvidia NVLink high-speed interconnects and optimised PCIe topologies. “With a modular design, HGX enclosures are suited for deployment in existing data center racks across the globe, using hyperscale CPU nodes as needed,” the company said.
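For readers who want to picture the eight-GPU cube mesh, the sketch below models one possible wiring. The layout is an assumption, since the exact link map is not given here: two groups of four GPUs, each group fully connected, with every GPU also linked to its counterpart in the other group. The names `build_cube_mesh` and `links_per_gpu` are hypothetical and do not come from any Nvidia tool.

```python
from itertools import combinations

NUM_GPUS = 8
QUAD_SIZE = 4  # assumed: the eight GPUs are split into two groups of four


def build_cube_mesh():
    """Return the set of GPU-to-GPU links for the assumed cube-mesh layout."""
    links = set()
    # Fully connect the GPUs inside each group of four.
    for first in range(0, NUM_GPUS, QUAD_SIZE):
        quad = range(first, first + QUAD_SIZE)
        for a, b in combinations(quad, 2):
            links.add((a, b))
    # Link each GPU to its counterpart in the other group.
    for i in range(QUAD_SIZE):
        links.add((i, i + QUAD_SIZE))
    return links


def links_per_gpu(links):
    """Count how many links terminate at each GPU."""
    counts = {gpu: 0 for gpu in range(NUM_GPUS)}
    for a, b in links:
        counts[a] += 1
        counts[b] += 1
    return counts


links = build_cube_mesh()
degree = links_per_gpu(links)
print(f"{len(links)} links in total")
for gpu, count in sorted(degree.items()):
    print(f"GPU {gpu}: {count} links")
```

Under this assumed layout, any two GPUs are at most two hops apart, which is the practical appeal of a mesh over a simple ring.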
Keith Morris noted an upgrade path for HGX, pointing out that both Nvidia Tesla P100 (based on Pascal) and V100 (based on the newly launched Volta) GPU accelerators are compatible with HGX. “This allows for immediate upgrades of all HGX-based products once V100 GPUs become available later this year,” the company said.
But surely there must be companies other than Nvidia also designing a standard AI box to plug and play in other AI computing architectures. Who are they?
Teich said, “I have to imagine that every company designing a deep learning accelerator chip (for training, inference or both) will build an appliance so that potential buyers can put it through its paces.”
However, he added, “Convincing a bunch of high volume ODMs to tune a reference design for individual cloud giant preferences is a much deeper level of engagement. We're seeing that deeper engagement here.”