What’s the difference between edge AI and endpoint AI, and how much ‘smart’ should you put into an edge AI device? We talk to various vendors about edge intelligence to see if there is a consensus on the ‘right way’.
When new industry buzzwords or phrases come up, the challenge for people like us who write about the topic is figuring out what exactly a company means, especially when it uses the phrase to fit its own marketing objective. The latest one is edge artificial intelligence or edge AI.
Because of the proliferation of internet of things (IoT) devices and the ability to add a fair amount of compute power or processing to enable intelligence within those devices, the ‘edge’ can be quite wide, and could mean anything from the ‘edge of a gateway’ to an ‘endpoint’. So, we decided to find out whether there was consensus in the industry on the definition of edge vs. endpoint, who would want to add edge AI, and how much ‘smartness’ you could add to the edge.
First of all, what is the difference between edge and endpoint? Well, it depends on your viewpoint: anything not in the cloud could be defined as the edge.
Definitions: edge can be many things, endpoint is literally the endpoint
Probably the clearest definition was from Wolfgang Furtner, Infineon Technologies’ senior principal for concept and system engineering. He said, “The term ‘edge AI’ inherits its vagueness from the term ‘edge’ itself. Some people call a car an edge device and others are using the term for a small energy harvesting sensor with low-power wireless connectivity. Edge is used in relative ways and distinguishes the more local from the more central. But indeed, there is a need to distinguish between the various kinds of things that you find at the edge. Sometimes you hear terms like edge-of-the-edge or leaf nodes being used. Edge AI can be many things including a compute server in a car.”
But the key he said, is that, “Endpoint AI resides at the location where the virtual world of the network hits the real world, where sensors and actuators are close.”
It’s all about semantics, and where you draw the boundary, according to Markus Levy, director of machine learning technologies, NXP Semiconductors. “Edge machine learning (ML) is the same as an ‘endpoint’ machine, except edge ML can also include ML that takes place in a gateway or even fog compute environment. Endpoint ML is typically related to distributed systems, for example where our customers are adding intelligence even down to the sensor level. Another example is a home automation system — where there are ‘satellite’ devices (such as thermostat, doorbell camera, security cameras, or other types of connected devices), and while these can independently perform machine learning functions, they might also feed into a gateway where more advanced ML processing occurs.”
From Arm’s perspective, Chris Bergey, its GM and VP of the infrastructure business, put it slightly differently, commenting on the increasing levels of intelligence in both edge servers and endpoints. He said, “Basic devices such as network bridges and switches have given way to powerful edge servers that add datacenter-level hardware into the gateway between endpoint and cloud. Those powerful new edge servers making their way into 5G base stations are plenty powerful enough to perform sophisticated AI processing — not only ML inference but training, too.”
And how is this different to endpoint AI? Bergey comments, “Due to their powerful internal hardware, smartphones have long been a fertile test-bed for endpoint AI. As the IoT intersects with AI advancements and the rollout of 5G, more on-device intelligence means that smaller, cost-sensitive devices can be smarter and more capable while benefiting from greater privacy and reliability due to less reliance on the cloud or internet. As this evolution of bringing more intelligence to endpoints continues, the boundaries of where exactly the intelligence takes place will also begin to blend from endpoint to edge, stressing the need for a heterogeneous compute infrastructure.”
There are others for whom edge is just everything that’s not in the cloud. For example, Jeff Bier, founder of the Edge AI and Vision Alliance, said, “We define edge AI as any AI that is implemented — in whole or in part — outside the data center. The intelligence might be right next to the sensor, for example in a smart camera, or a bit further away such as an equipment closet in a grocery store, or even further away, such as in a cellular base station. Or some combination or variation of these.”
That’s a similar position taken by Xilinx. Nick Ni, the company’s director of product marketing for AI, software and ecosystem, said, “Edge AI is basically a self-sufficient intelligence deployed on the field without reliance on a data center. It is essential for applications that require real-time response, security — for example, not sending confidential data to the data center — and low-power consumption, which is most of the devices out there. Just like humans don’t rely on a data center to make countless decisions daily, edge AI will dominate the market in applications like semi-autonomous cars and smart-retail systems in coming years.”
And Andrew Grant, senior director for artificial intelligence at Imagination Technologies, re-affirms this. “It’s all edge as far as we are concerned. It’s the customer who decides where it goes. We’ll see very much a hybrid approach, and there’s absolutely a role for the cloud and data centers in this too.” He added, “The speed with which the market is moving [to the edge] is phenomenal. There’s been a wave of movement to the edge, but for many applications it takes time for the silicon to materialize.” Grant explained, “We were talking to a traffic management company in China that was moving data back and forth from the cloud. When I explained what we do, they immediately saw the benefit of not having to take the data to the cloud if the traffic lights themselves can determine whether a car is moving or not.”
Embedded systems provider Adesto Technologies’ CTO, Gideon Intrater, said that they don’t necessarily differentiate between edge and endpoint, as the company provides devices for IoT edge servers as well as IoT edge devices. “While we don’t tend to use the word ‘endpoint’ in our own communications, perhaps definitionally ‘endpoint’ is aligned with the edge devices. AI in these devices would typically be some amount of local inference, with the algorithms running as a program on a processor, using a dedicated accelerator, through near-memory processing, or in-memory computing.”
He added, “AI at the edge is becoming a reality across just about every application. We see a great opportunity in industrial and building implementations where AI can provide benefits through predictive and preventive maintenance, quality control in manufacturing, and many other areas. The industry is just getting started, and every day that passes, we expect AI to do more for us. When our older devices without AI don’t intuitively understand our needs, we often get frustrated because we have other devices that will provide intuitive capability. The end consumer doesn’t know what goes into making an AI solution work; they just expect it to work.”
It’s early in the technology adoption cycle — so who would want it?
So we are clear on the definitions: you either sit in the camp that says edge is everything that’s not in the cloud, or with others who clearly identify the endpoint as the meeting point of the physical world with the digital world, mostly at the sensors. But the specific application will determine the point at which the intelligence might need to be added, with an increasingly blurred line between edge and endpoint and a somewhat heterogeneous compute infrastructure.
The next question is who would want it, and what are the market expectations for edge AI? “This is something we’re all still figuring out,” explains NXP’s Levy. “The industry leaders are well engaged in implementing it; I can’t name names, but we have a wide range of customers doing all kinds of machine learning at the edge. However, if you look at the ‘technology adoption cycle’, I still believe the majority of the industry is not even at the ‘early adopter stage’, and this will really begin to unfold towards middle to late 2020.
“Customers are still comprehending the cool things that are possible with machine learning. But I typically give a few guidelines: 1) can it save money? For example, by making a factory assembly line run faster or more efficiently, say by replacing headcount that was previously doing visual inspection; and 2) can it make money? For example, by adding a cool feature to a product that makes it more useful. Maybe this is a barcode scanner that, by using machine learning, can compensate for wrinkles in a package that previously made it impossible to scan accurately.”
Infineon’s Furtner said this raises the question, “What is the benefit of edge AI?” He said, “Actually, the great thing about the edge is that we can turn its ‘weaknesses’ with respect to constraints into strengths. People do care about things like ease of use, functionality, privacy, security, cost, climate or sustainable use of resources. These are all benefits we can make possible with edge AI. We are convinced that AI at the right places enhances our lives, and that there are many use cases for AI in endpoints. Edge AI is used for predictive maintenance, further automation or robotics, home automation and smart farming, to name a few applications. With our work on low-power AI-enabled sensors we make intuitive sensing more ubiquitous, spurring new applications in the home or city that can make lives easier, safer and greener. Not depending on the cloud enables entirely new usage models in industry or home applications that cater for privacy and security.”
Also, he said, edge AI provides the ability to derive value from the exploding amount of IoT data in a more resource-efficient and thus sustainable way, which is paramount in times of climate change.
Jeff Bier said application requirements would drive the need for edge AI in five key areas:
- Bandwidth: even with 5G, there may not be sufficient bandwidth to send all of your raw data up to the cloud.
- Latency: many applications require faster response time than you can get from the cloud.
- Economics: even if a given application can technically use the cloud in terms of bandwidth and latency, it may be more economical to perform AI at the edge.
- Reliability: even if a given application can technically use the cloud in terms of bandwidth and latency, the network connection to the cloud is not always reliable, and the application may need to run regardless of whether it has this connection. In such cases, edge AI is needed. An example is a face recognition door lock; if the network connection is down, you still want your door lock to work.
- Privacy: even if a given application can technically use the cloud in terms of bandwidth, latency, reliability, and economics, there may be many applications which demand local processing for privacy reasons. An example is a baby monitor or bedroom security camera.
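Bier’s five criteria can be restated as a toy decision helper. The sketch below is entirely hypothetical (not from Bier or any vendor): it simply says that if any one of the five constraints rules out the cloud, the AI workload stays at the edge.

```python
# Hypothetical restatement of the five edge-vs-cloud criteria above:
# bandwidth, latency, economics, reliability and privacy each get a
# yes/no answer, and the cloud wins only if every answer is yes.

def place_workload(bandwidth_ok, latency_ok, cloud_cheaper,
                   link_reliable, privacy_allows_cloud):
    """Return 'cloud' only when every criterion permits it."""
    if all([bandwidth_ok, latency_ok, cloud_cheaper,
            link_reliable, privacy_allows_cloud]):
        return "cloud"
    return "edge"

# Bier's face-recognition door lock: the network link may drop,
# so the workload must run locally regardless of the other criteria.
print(place_workload(True, True, True, False, True))  # edge
```

In practice each “yes/no” would of course be a measured quantity with thresholds, but the shape of the decision — cloud only when every constraint permits it — is the point the list makes.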
How smart should the edge be? It depends on memory capacity
This might seem an obvious question, but the reality is that you need to be pragmatic, with each application being different.
“Smart is generally not the limiting factor — the limit is memory capacity,” comments Levy. “In practice, memory limits the size of the machine learning model that can be deployed, especially in the MCU domain. And to go one level deeper here, a machine learning model for, say, a vision-based application will require more processing power and more memory. Again, processing power is more of a factor when real-time response is required.”
“An example I give is a microwave oven with an internal camera to determine what kind of food was put in — a one- or two-second response time could be sufficient, thereby enabling the use of something like an NXP i.MX RT1050. The amount of memory will dictate the size of the model, which in turn dictates the number of food classes the machine can recognize. But what if food is inserted that isn’t recognized? Now go to the gateway or the cloud to figure out what it is, then use that information to allow the smart edge device to retrain. To directly answer the question about how much ‘smart’ to include, it all boils down to tradeoffs of performance, accuracy, cost and energy. To add to this, we are also working on an application that uses autoencoders for another form of ML, anomaly detection. In short, autoencoders are quite efficient; one example we implemented took only 3 KB and did inferencing in 45-50 µs — easily the job of an MCU.”
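The autoencoder-based anomaly detection Levy mentions works by reconstructing the input through a narrow bottleneck and flagging samples whose reconstruction error is high. A minimal sketch of that inference step, with entirely hypothetical hand-picked weights standing in for a model trained offline on “normal” sensor data:

```python
# Minimal autoencoder anomaly-detection sketch. The weights below are
# hypothetical stand-ins for a model trained offline on normal data
# that lies in the subspace spanned by the first two input dimensions.

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

ENC = [[1, 0, 0, 0],                      # 4-D input -> 2-D bottleneck
       [0, 1, 0, 0]]
DEC = [[1, 0], [0, 1], [0, 0], [0, 0]]    # 2-D latent -> 4-D output

def reconstruction_error(x):
    z = matvec(ENC, x)        # encode into the learned latent space
    x_hat = matvec(DEC, z)    # decode back to the input space
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def is_anomaly(x, threshold=0.1):
    # Normal samples reconstruct well; anomalies leave a large residual.
    return reconstruction_error(x) > threshold

normal = [0.5, -0.3, 0.0, 0.0]   # lies in the learned subspace
faulty = [0.5, -0.3, 0.9, 0.4]   # energy outside it -> high error
print(is_anomaly(normal), is_anomaly(faulty))  # False True
```

The footprint is just the encoder and decoder weights plus a scalar threshold, which is why such models can fit in a few kilobytes and run in microseconds on an MCU, as Levy describes.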
Furtner echoes the pragmatic approach. “Edge AI is heavily constrained concerning energy consumption, space and cost. In this case the question is not ‘how much smartness we should put into the edge’ but ‘how much smartness can we afford in the edge?’ And the follow-up question would be ‘which of the known AI techniques can be slimmed down in a way that they are sufficiently ‘tiny’ to be applied in the edge?’ So, certainly power consumption limits the amount of endpoint intelligence. These endpoints are often powered by small batteries or even depend on energy harvesting. Data transmission costs a lot of energy too.”
He adds, “Let’s take a smart sensor. For local AI to function properly under these circumstances it has to be optimized for its specific properties and behaviors. In addition, some new sensors only will become possible through the embedded AI; for instance environmental sensors for liquids and gases. There are many reasons for endpoint AI. Intelligent data usage and reduction or fast real-time local reactions are obvious ones. Data privacy and security are others. Massive sensor raw data can be processed where it is generated – whereas intensive compute tasks remain in the cloud. Recent advances in lowest power neural computing (for example, edge TPUs, neuromorphic technologies) shift this boundary in favor of the edge and endpoint nodes.”
Imagination Technologies’ Grant said, “To our mind, it’s obvious to put as much intelligence as possible in the edge, and then software optimization can be used during the lifetime of the device.” He likens this to the games console industry, where the vendors release a new console but then it is optimized with software updates over the life of the hardware. He added that adding a neural network accelerator to a system-on-chip (SoC) is not significant from a cost or size viewpoint. “So the opportunities to speed up at the edge are really dramatic.”
Arm’s Bergey said, “As heterogeneous compute becomes ubiquitous throughout infrastructure, it is critical that we are able to identify where it makes most sense to process data, and this will vary from application to application and may even change based on time of the day. The market requires solutions that will enable the handing off of different roles to different layers of AI in order to gain the kind of overall insight that drives real business transformation. At the edge, AI is set to play a dual role. At a network level, it could be used to analyze the flow of data for network prediction and network function management – intelligently distributing that data to wherever it makes most sense at the time, whether that’s the cloud or elsewhere.”
Adesto’s Intrater added, “The decision of how much ‘smart’ should be put at the edge is dependent on the specific application, and how much latency it can handle (not much for real-time mission-critical applications), what the power envelope is (very small for battery operated devices), security and privacy concerns, as well as whether there is an internet connection. Even with an internet connection, you wouldn’t want to send everything to the cloud for analytics because of the bandwidth expense. The division of the smarts across the edge and the cloud is about balancing all these concerns.”
He continued, “You could also do AI on a local edge server, and of course training and analytics are often done in the cloud. Often, it is not a straightforward decision of where AI happens – the ‘smarts’ are often distributed, with some happening in the cloud and some in the edge device. A typical AI system has such a split between which AI is done locally and which is done remotely. Alexa/Siri are a good example, where there are algorithms in the device for voice/keyword recognition, and then the interactions from there take place in the cloud.”
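The local/remote split Intrater describes for voice assistants can be sketched as a gate: a tiny always-on keyword check runs on the device, and only utterances that pass it are forwarded to the heavier cloud pipeline. Everything in the sketch below is hypothetical (the wake phrase, the function names, and the string-matching stand-in for a real keyword-spotting model):

```python
# Hypothetical sketch of the device/cloud split: a lightweight
# on-device keyword gate decides what, if anything, goes to the cloud.

KEYWORD = "hey device"  # hypothetical wake phrase

def on_device_keyword_check(transcript: str) -> bool:
    """Stand-in for a tiny on-device keyword-spotting model."""
    return transcript.lower().startswith(KEYWORD)

def cloud_understand(transcript: str) -> str:
    """Stand-in for the full natural-language pipeline in the cloud."""
    return f"cloud handled: {transcript}"

def handle_utterance(transcript: str) -> str:
    if not on_device_keyword_check(transcript):
        return "dropped locally"  # audio never leaves the device
    return cloud_understand(transcript)

print(handle_utterance("hey device, set a timer"))
print(handle_utterance("background chatter"))
```

The design choice is the same one Intrater points to: the cheap, privacy-preserving decision happens at the endpoint, and only the small fraction of data that needs heavy processing is escalated.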
What are the enabling technologies?
“There are many key enabling technologies for edge AI. Perhaps the most obvious is high-performance, energy efficient, inexpensive processors that are good at running AI algorithms,” said Edge AI and Vision Alliance’s Bier. “But there are many others. Some of the most important are (1) software tools to enable efficient use of these processors, and (2) cloud platforms to aggregate metadata from edge devices, and to manage the provisioning and maintenance of edge devices.”
As you’d expect, most of the companies we spoke to provide a range of devices and IP for edge AI. Infineon said it provides sensors, actuators, microcontrollers including NN (neural network) accelerators and hardware-security modules for the IoT. “Power efficiency, safety and security are part of our key competences. With our portfolio we help to link the real with the digital world offering secure, robust and energy-efficient AI-solutions for the edge,” Wolfgang Furtner said.
Nick Ni at Xilinx said taking AI edge products to market is non-trivial as engineers need to blend machine learning technologies with conventional algorithms like sensor fusion, computer vision and signal transformation. “Optimizing all the workloads to meet end-to-end responsiveness requires an adaptable domain specific architecture (DSA) which allows programmability in both hardware and software. Xilinx SoCs, FPGAs and ACAPs provide such adaptable platforms to allow continuous innovation while meeting the end-to-end product requirements.”
NXP said its enabling technologies include hardware and software. Levy said, “There are customers that use our low-end Kinetis or LPC MCUs for some smart functionality. It starts to get more interesting at our i.MX RT crossover processor level, where we provide integrated MCUs with Cortex-M7 cores running at 600 MHz to 1 GHz. Our new RT600 includes a Cortex-M33 and a HiFi4 DSP, whereby we enable medium-performance machine learning by running in a heterogeneous mode, using the DSP to accelerate various components of a neural network. Moving way up the spectrum, our latest i.MX 8M Plus combines four Cortex-A53 cores with a dedicated neural processing unit (NPU) that delivers 2.25 TOPS and two orders of magnitude more inference performance, while running at less than 3 W. This high-end NPU is critical for applications such as real-time speech recognition (i.e. natural language processing), gesture recognition, and live video face and object recognition.”
From the software perspective, Levy said NXP provides its eIQ machine learning software development environment to enable open-source ML technologies across the NXP portfolio, from i.MX RT to i.MX 8 apps processors and beyond. “And with eIQ, we give customers the option to deploy ML on the compute unit of their choice – including CPU, GPU, DSP, or NPU. You’ll even see heterogeneous implementations that run a voice application like keyword detection on the DSP, face recognition on the GPU or CPU, and a high-performance video application on the NPU, or any combination thereof.”
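The heterogeneous dispatch Levy describes amounts to mapping each workload to a preferred compute unit and falling back when that unit is absent. This is a hedged sketch of the idea only — it is not the eIQ API, and the unit names and workload table are hypothetical:

```python
# Hypothetical heterogeneous-dispatch sketch (not the eIQ API):
# route each ML workload to a preferred compute unit, falling back
# to the CPU when the preferred unit is not present on this SoC.

AVAILABLE_UNITS = {"cpu", "dsp", "npu"}  # hypothetical SoC with no GPU

PREFERRED = {
    "keyword_detection": "dsp",   # voice workload on the DSP
    "face_recognition": "gpu",    # would prefer the GPU...
    "video_analytics": "npu",     # high-performance video on the NPU
}

def assign_unit(workload: str) -> str:
    unit = PREFERRED.get(workload, "cpu")
    return unit if unit in AVAILABLE_UNITS else "cpu"

for workload in PREFERRED:
    print(workload, "->", assign_unit(workload))
```

On this hypothetical part, face recognition silently falls back to the CPU because no GPU is available — the same portability-across-compute-units argument Levy makes for letting customers choose the deployment target.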
Arm’s Bergey said, “As we move towards a world of one trillion IoT devices, we’re facing an infrastructural and architectural challenge greater than ever before – and as such, the technology we need to answer this great opportunity is constantly evolving. At Arm, our focus is on providing highly configurable, scalable solutions that meet the performance and power requirements to enable AI everywhere.”
For AI at the edge, Adesto provides enabling technologies including ASICs with AI accelerators, NOR flash memory for storing the weights in an AI chip used for voice and image recognition, and smart edge servers that connect legacy and new data to cloud analytics like IBM Watson and Microsoft Azure.
Intrater added, “We are also exploring in-memory AI computing with our RRAM technology, where individual memory cells serve as both storage elements and compute resources. In this paradigm, the matrices of the deep neural networks (DNNs) become arrays of NVM cells, and the weights of the matrices become the conductances of the NVM cells. The dot-product operations are done by summing up the current that results from applying the input voltages onto the RRAM cells. Since there is no need to move the weights between the compute resource and the memory, this model can achieve an unrivalled combination of power efficiency and scalability.”
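The physics Intrater describes reduces to Ohm’s law and Kirchhoff’s current law: each cell contributes a current I = G × V, and the currents along a bit line sum into one element of a matrix-vector product, with no weight movement at all. A small numerical simulation of that summation (the conductance and voltage values are hypothetical):

```python
# Simulation of the in-memory dot product described above: weights are
# stored as cell conductances G (siemens), inputs are applied as word-line
# voltages V, and each bit line's output current is the Kirchhoff sum of
# the per-cell Ohm's-law currents I = G * V. Values are hypothetical.

G = [  # conductance matrix: one row of cells per output bit line
    [0.25, 0.5, 0.125],
    [0.5,  0.0, 0.25],
]
V = [1.0, 0.5, 2.0]  # input voltages, one per word line

def bitline_currents(G, V):
    # Summing currents along each bit line computes one dot product
    # per row -- i.e. the matrix-vector product G @ V, in the analog domain.
    return [sum(g * v for g, v in zip(row, V)) for row in G]

print(bitline_currents(G, V))  # [0.75, 1.0]
```

A real array would then digitize these currents with ADCs and contend with device noise and conductance drift, but the core advantage — the multiply-accumulate happening where the weights already live — is exactly what the digital simulation shows.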
To me, there’s a very clear distinction between edge AI and endpoint AI. The endpoint is the point at which the physical world interfaces with the digital world. But the way the edge is defined is very elastic. Vendors range from those who say everything that is not in the data center is the edge (anything from gateways to the edges of networks to cars) to those who define endpoints as a subset of the edge.
Ultimately, the definition is irrelevant. It comes down to the application and how much intelligence one can practically place at the endpoint, or at the edge. This comes down to the tradeoff between memory availability, performance needs, cost, and energy consumption. That tradeoff will determine how much inferencing and analysis can be done at the edge, how many neural network accelerators are needed, and whether they are part of an SoC or sit alongside a CPU, GPU or DSP. This is not to forget innovative new ways of solving the challenge, using techniques like in-memory computing for AI.
There is broad consensus on this: how much intelligence you put in is dependent on application, and requires a pragmatic approach based on the available resources.