Compute and Memory Need to get Cosy

By Gary Hilson

There's been much talk about which memories might best support AI and ML. But just as important are the architectures, regardless of whether the workload is done in the cloud or at the edge.

TORONTO — Big data applications have already driven the need for architectures that put memory closer to compute resources, but artificial intelligence (AI) and machine learning are further demonstrating how hardware and hardware architectures play a critical role in successful deployments. A key question, however, is where the memory is going to reside.

Research commissioned by Micron Technology found that 89% of respondents say it is important or critical that compute and memory are architecturally close together. The survey, carried out by Forrester Research, also found memory and storage are the most commonly cited concerns regarding hardware constraints limiting AI and machine learning today. More than 75% of respondents recognize a need to upgrade or rearchitect their memory and storage to limit architectural constraints.

AI compounds the challenges already unearthed by big data and analytics requirements because machine learning performs multiply-accumulate operations on a vast matrix of data over neural networks. These operations are repeated over and over as more results come in, refining the algorithm so that it takes the best path and makes the best choice each time. In effect, it learns from working on the data.
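The repeated accumulation over a matrix of data described above is commonly implemented as a series of multiply-accumulate (MAC) steps. The following is a minimal sketch in plain Python, not taken from the article or from any vendor's implementation; the function names and the small example values are assumptions chosen for illustration.

```python
def mac_dot(weights, inputs):
    """Dot product written as an explicit multiply-accumulate loop."""
    acc = 0.0
    for w, x in zip(weights, inputs):
        acc += w * x  # one multiply-accumulate (MAC) step
    return acc


def layer_forward(weight_matrix, inputs):
    """Compute one output per row of the weight matrix."""
    return [mac_dot(row, inputs) for row in weight_matrix]


# Hypothetical 2x3 weight matrix and 3-element input, for illustration only.
weights = [[1.0, 2.0, 3.0],
           [0.5, 0.0, -1.0]]
inputs = [2.0, 1.0, 4.0]

outputs = layer_forward(weights, inputs)
mac_count = len(weights) * len(inputs)  # rows * columns MAC steps in total
print(outputs, mac_count)
```

Even this toy layer touches every weight once per input, which is why the number of MAC steps, and the amount of data that has to be fetched, grows with the size of the matrix.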

Because there’s so much data, said Colm Lysaght, Micron’s vice president of corporate strategy, a common solution for getting the necessary working memory is simply to add more DRAM. This is shifting the performance bottleneck from raw compute to where the data is. “Memory and storage is where the data is living,” he said. “We have to get it to a CPU and then back again, over and over again, as these vast data sets are being worked on.”

Finding ways to bring compute and memory closer together means saving power, because data isn’t being shuttled around as much, Lysaght said. “It's increasing performance because more things can happen right where they need to happen,” he said.
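One rough way to see why the bottleneck moves from compute to data movement is to count arithmetic operations against bytes fetched for a single matrix-vector product. The sketch below is a back-of-the-envelope estimate, not a measurement: it assumes 32-bit values, assumes every value crosses the memory interface exactly once, and ignores caches. The function name and the 4096 x 4096 size are hypothetical.

```python
def matvec_cost(rows, cols, bytes_per_value=4):
    """Estimate operations and memory traffic for one matrix-vector product.

    Assumes each matrix entry, input value, and output value is moved
    exactly once (no cache reuse), which is a simplification.
    """
    flops = 2 * rows * cols  # one multiply and one add per matrix entry
    bytes_moved = bytes_per_value * (rows * cols + cols + rows)
    return flops, bytes_moved, flops / bytes_moved


# Hypothetical layer size, chosen only for illustration.
flops, bytes_moved, intensity = matvec_cost(4096, 4096)
print(flops, bytes_moved, round(intensity, 3))
```

Under these assumptions the workload performs roughly half an arithmetic operation per byte moved, so the time and energy spent shuttling data can outweigh the arithmetic itself.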

Micron Memory
Micron sees existing memory and storage technologies such as DRAM and 3D NAND SSDs providing the hardware for AI architectures, but is also researching newer technologies such as processor-on-memory architectures while supporting pioneering startups.

There are a number of different approaches to creating better architectures, Lysaght said. One example is neuromorphic processors that use a neural network internally and spread the work across a larger number of smaller cores. “Because there is a large matrix of data that's being worked on, having many more cores doing relatively simple operations over and over again is a better solution,” Lysaght said.

One memory company that’s interested in developing new architectures is Crossbar Inc. Along with Gyrfalcon Technology Inc., mtes Neural Networks Co. (mtesNN) and RoboSensing Inc., it formed SCAiLE (SCalable AI for Learning at the Edge), an AI consortium dedicated to delivering an accelerated, power-saving AI platform.

Sylvain Dubois, vice president of strategic marketing and business development at Crossbar, said the group will combine advanced acceleration hardware, resistive RAM (ReRAM) and optimized neural networks to create ready-made, power-efficient solutions with unsupervised learning and event recognition capability.

Dubois said the challenge for many companies is that they want AI on a device but have no idea how to do it, whether it’s a smart speaker, smart camera or smart TV. The goal of the consortium is to provide a platform that brings all the necessary pieces together.

Crossbar’s contribution is around the memory — specifically ReRAM — that will help process the data that comes into a machine learning system via a wide range of inputs, including text, keywords, GPS coordinates and visual data from sensors — all of it unstructured.

Dubois envisions a memory array architected to be read by specific processing codes for each of these instances in a very wide and highly parallel way, where a thousand bits are read in parallel in an edge device.

“If you're a match, then you know at the edge what to do,” Dubois said. “But if you don't have a match, then this is what we call the learning curvature.”

For example, in the case of a camera sensor, he said, the system will be able to store new events or a set of features in a spare location of the ReRAM array. “The next time you have a similar event passing in front of this camera, the camera itself will be able to detect it without any training in the cloud,” Dubois said.
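The match-or-learn behavior Dubois describes can be sketched in software as a small pattern memory: look for a stored pattern close to the incoming features, and if none is found, write the features into a spare slot. This is an illustrative sketch only, not Crossbar's or SCAiLE's design. The class name, the use of Hamming distance on bit patterns, and the `max_distance` threshold are all assumptions, and the sequential loop stands in for what the hardware would read in parallel.

```python
class PatternMemory:
    """Toy match-or-learn memory (hypothetical; not a vendor API)."""

    def __init__(self, capacity, max_distance):
        self.slots = [None] * capacity  # None marks a spare location
        self.max_distance = max_distance

    @staticmethod
    def hamming(a, b):
        """Number of differing bits between two integer bit patterns."""
        return bin(a ^ b).count("1")

    def lookup(self, features):
        """Return the index of the closest stored pattern, or None."""
        best = None
        for i, stored in enumerate(self.slots):
            if stored is None:
                continue
            distance = self.hamming(stored, features)
            if distance <= self.max_distance and (best is None or distance < best[0]):
                best = (distance, i)
        return None if best is None else best[1]

    def observe(self, features):
        """Match against stored patterns; on a miss, learn into a spare slot."""
        index = self.lookup(features)
        if index is not None:
            return ("match", index)
        for i, stored in enumerate(self.slots):
            if stored is None:
                self.slots[i] = features
                return ("learned", i)
        return ("full", None)


memory = PatternMemory(capacity=4, max_distance=1)
first = memory.observe(0b10110010)   # nothing stored yet, so it is learned
second = memory.observe(0b10110011)  # differs by one bit, so it matches
third = memory.observe(0b01001100)   # far from anything stored, so it is learned
print(first, second, third)
```

The point of the sketch is that the second, similar event is recognized locally from what was stored the first time, with no retraining step outside the device.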

More Analytics in the Cloud

This presents a totally different way of doing AI because it’s not dependent on the massive training capacity in the cloud if an unexpected event occurs that needs quick decision making, such as a traffic scenario where safety is a concern, Dubois said.

Enabling more machine learning at the edge jibes with the Forrester Research study, which projects that more firms will be running analytics in public clouds and at the edge. Fifty-one percent of respondents said they are running analytics in public clouds, a number that is forecast to increase to 61% in the next three years. In addition, while 44% run analytics at the edge today, Forrester predicts that will grow to 53% by 2021.

Chris Gardner, Forrester’s senior analyst serving infrastructure and operations professionals, prepared the report for Micron and was initially surprised by how much the hardware stuff “bubbled up,” specifically storage and memory.

“I expected to see more around software programmability issues around the hardware, and governance issues. And certainly that did pop up, but not to the extent that these other things did,” Gardner said.

What also sprang forth from the research is how a tremendous amount of work is being done in memory itself while avoiding storage if possible, said Gardner. But what’s interesting to note is that the need for memory and storage depends on what you’re doing. According to Gardner, training the model requires a tremendous amount of memory and storage. Otherwise, he said, you need barely anything at all.

Crossbar Memory
Crossbar, which recently formed a consortium to create AI platforms, offers memory products targeted for AI applications, such as its P-Series MCU with embedded ReRAM.

In a perfect world, Gardner said, companies would love to have a massive environment with hundreds of gigabytes if not terabytes of RAM sitting there ready to go. But in reality, they would have to build it or pay a provider to do it for them, he said, adding that what is needed is a paradigm shift, hardware-wise.

“We need more memory-centric architecture,” Gardner said. The compute needs to surround the memory, and to a lesser degree the storage, rather than compute being at the center, he added.

“It’s not that the compute architecture and the way we're approaching it is bad today, but it may not be the most efficient way of doing AI and machine learning,” Gardner said.

Gardner also covers edge computing for Forrester. One scenario that comes up is a stadium that hosts major sporting leagues and is populated with cameras, which produce a tremendous amount of data that needs to be processed quickly to figure out if there's a dangerous situation. “They could send that to the cloud and back, but they don't have the time to do that,” he said. “They have to process it as quickly as possible.”

There will continue to be some machine learning that’s done in the cloud and sent back out to Internet of Things (IoT) devices, but some of those devices will be increasingly intelligent and do their own machine learning that can be shared back to the cloud and then to other devices. For memory makers, this means a continued transformation from commodity component makers, said Gardner, as well as recompiling applications to take advantage of a memory-centric architecture necessary for AI and machine learning workloads.

But we’re still in an experimentation phase right now, as there isn’t any real tensor flow that you can put together that's using a memory-centric architecture that also has great latency outside of experimentation.

“We've been running with a CPU kind of mindset for decades now,” Gardner said. “The idea that we're going to get out of that mindset is pretty revolutionary.”

Last fall, Micron Technology announced a $100 million venture fund with a focus on AI. The company has a DRAM-like product in the lab that it aims to sample in 2021, while its researchers are looking at processor-on-memory architectures also being explored by startups.
