Oculi Tech Mimics Human Eye

By Sally Ward-Foxton

Oculi's pixel design means its vision sensor can deliver smart events or actionable metadata to improve power and latency for machine vision.

An early-stage company spun out of Johns Hopkins University wants to make machine vision more like human vision by adding memory and computing to each sensor pixel. Oculi is developing products for gesture recognition and eye tracking in consumer AR/VR systems. Other applications include smart city infrastructure and, eventually, automotive vision sensing.

Oculi CEO Charbel Rizk told EE Times that, beyond the buzz over existing event-based vision sensing frameworks, there is plenty of room for innovation elsewhere.

Charbel Rizk (Source: Oculi)

“The problem that we’re running into right now with machine vision is that we’re using sensors and processors that were developed for different purposes, putting them together and thinking if we throw enough processing downstream we solved the problem,” Rizk said. “That’s not the case, because the problem really starts at the sensor. Machine vision is not about pretty images. It really should be about efficiency: How do we get the information in an efficient way?”

Oculi’s sensing and processing unit (SPU) chip integrates sensing and processing at the pixel level, similar to how the human eye works. An Oculi pixel includes a sensor alongside digital processing – logic and a small memory – making the pixel smart enough to deliver information only when it detects something of interest. A full-frame output is still possible if the application requires it; the user can select among output modes in software to optimize privacy, latency and power consumption. For most applications, the SPU runs on milliwatts of power, Rizk added.

Like existing dynamic vision sensors that use event-based vision, Oculi’s sensor can output events that flag changes in pixel data. But Rizk said this type of event on its own lacks the efficiency needed to emulate human vision.

“There are times when you’re not looking for changes, you’re looking for something in the scene,” he said. “Event sensors out there today don’t give you any information at that point, they become blind. There are times when you do want the full frame… so we built our architecture to allow you to get all these outputs using software.”

Oculi’s SPU combines vision sensing, processing and memory at the pixel level. (Source: Oculi)

With every pixel capable of some basic computation, algorithms can be implemented on the SPU without external processing. On top of full frame images and events, that allows two additional forms of output.

One is “smart events,” which use less than 10 percent of the bandwidth of a full-frame image while still containing sufficient information for an application. Smart events can also be based on color or depth sensing (using two SPUs for high-speed stereo vision). Crucially, smart events vastly reduce signal noise compared to basic event-based vision: memory in each pixel means consistency can be evaluated over multiple frames to help eliminate noise. Bandwidth is also reduced compared to purely event-based vision.
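The idea of using per-pixel memory to check consistency across frames can be illustrated with a short sketch. This is not Oculi's implementation, which the article does not describe in detail. It assumes frames are small grids of brightness values, that a basic event is a brightness change above a threshold, and that a "smart" event fires only after a pixel has changed in several consecutive frame pairs. The function names, the threshold and the streak length are all hypothetical.

```python
# Illustrative sketch only: all names and parameters are hypothetical,
# not taken from Oculi's SPU or SDK.

def basic_events(prev, curr, threshold):
    """Return the set of (row, col) pixels whose value changed by more than threshold."""
    return {
        (r, c)
        for r, row in enumerate(curr)
        for c, value in enumerate(row)
        if abs(value - prev[r][c]) > threshold
    }


def smart_events(frames, threshold=10, min_streak=2):
    """For each frame pair, emit pixels that changed in min_streak consecutive pairs."""
    rows, cols = len(frames[0]), len(frames[0][0])
    streak = [[0] * cols for _ in range(rows)]  # stands in for per-pixel memory
    emitted = []
    for prev, curr in zip(frames, frames[1:]):
        changed = basic_events(prev, curr, threshold)
        out = set()
        for r in range(rows):
            for c in range(cols):
                # Count consecutive changes; reset as soon as the pixel is stable.
                streak[r][c] = streak[r][c] + 1 if (r, c) in changed else 0
                if streak[r][c] >= min_streak:
                    out.add((r, c))
        emitted.append(out)
    return emitted


# Pixel (1, 1) brightens steadily; pixel (0, 2) changes once and then stays put.
frames = [
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]],
    [[0, 0, 0], [0, 50, 0], [0, 0, 0]],
    [[0, 0, 40], [0, 100, 0], [0, 0, 0]],
    [[0, 0, 40], [0, 150, 0], [0, 0, 0]],
]
result = smart_events(frames)
```

In this toy input, the one-off change at pixel (0, 2) shows up as a basic event but never as a smart event, while the steadily changing pixel (1, 1) is reported from the second frame pair onward. A real sensor would need a more careful consistency test; a single-frame spike, for instance, produces two consecutive changes and would pass this simple streak check.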

A successful field test of Oculi’s sensor in Chicago counted axles of passing vehicles for electronic toll billing; the test also estimated speed using smart events.

Possible outputs from Oculi SPU (L-R): Full frame image, events, smart events, actionable data (top row is number of vehicle axles, bottom row shows the hand gesture “swipe right”). (Source: Oculi)

The other output, “actionable information,” reduces bandwidth even further by processing smart events on the SPU using pattern recognition techniques. For some applications, further vision processing is not required.
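As a rough illustration of how a stream of smart events might be reduced to a single piece of actionable information, the sketch below classifies a horizontal swipe from the drift of the event centroid over time. The article does not say which pattern recognition techniques Oculi uses, so the method here is an assumption chosen for simplicity. The function names, the representation of events as sets of (row, col) pairs, and the `min_travel` parameter are hypothetical.

```python
# Illustrative sketch only: a centroid-drift heuristic, not Oculi's method.

def centroid_x(events):
    """Mean column of a set of (row, col) events, or None if the set is empty."""
    if not events:
        return None
    return sum(col for _, col in events) / len(events)


def classify_swipe(event_frames, min_travel=3.0):
    """Label a sequence of event sets by how far the centroid moved horizontally."""
    xs = [x for x in (centroid_x(e) for e in event_frames) if x is not None]
    if len(xs) < 2:
        return "none"
    travel = xs[-1] - xs[0]  # positive means increasing column index
    if travel >= min_travel:
        return "swipe right"
    if travel <= -min_travel:
        return "swipe left"
    return "none"


# A two-pixel blob of events moving from column 1 to column 6.
hand = [{(2, 1), (3, 1)}, {(2, 3), (3, 3)}, {(2, 6), (3, 6)}]
label = classify_swipe(hand)
```

The output here is a short label rather than a set of pixel coordinates, which is the sense in which this kind of result needs less bandwidth than the events it was derived from. The sketch treats increasing column index as "right"; a real system would have to account for camera orientation and mirroring.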

For example, Oculi hardware deployed as part of a smart city infrastructure field test was reprogrammed at the customer’s request to function as a flash-flood alert system. The sensor was calibrated to count raindrops falling in front of a camera to estimate precipitation. (This was done by identifying the distinctive size and motion of raindrops.) The rainfall estimate was computed entirely on the SPU.
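The size-based part of that idea can be sketched as counting small connected blobs in a binary event mask. This is a simplified illustration under stated assumptions, not the algorithm used in the field test: it covers only the size criterion and leaves out the motion check, and the function name, the 4-neighbor connectivity and the size limits are hypothetical.

```python
# Illustrative sketch only: size filtering of event blobs, motion check omitted.

def count_drop_sized_blobs(mask, min_size=1, max_size=4):
    """Count 4-connected blobs of active pixels whose area is within the size limits."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Flood-fill this blob with an explicit stack and measure its area.
            stack = [(r, c)]
            seen[r][c] = True
            size = 0
            while stack:
                y, x = stack.pop()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if min_size <= size <= max_size:
                count += 1
    return count


# Two small blobs (areas 2 and 1) and one large blob (area 6) that is too big to count.
mask = [
    [1, 0, 0, 0, 0, 0],
    [1, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 0, 0],
]
drops = count_drop_sized_blobs(mask)
```

Turning such a count into a precipitation estimate would require calibration against real rainfall, as the article notes, and that step is outside the scope of this sketch.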

Gesture recognition

Oculi was spun out of Johns Hopkins University in 2019 based on Rizk’s academic work, including multiple generations of test chips. The startup is in the process of closing a seed funding round while negotiating with potential foundry partners.

While the technology was originally focused on military applications (early demonstrations detected muzzle flashes), Oculi is now pursuing both consumer and automotive applications. For now, the company is targeting gesture recognition and eye tracking in consumer AR/VR systems. AR/VR vendors want to eliminate handheld remote controls and conserve battery power in headsets. Oculi is also working with automotive manufacturers on future ADAS/AV opportunities. Smart city infrastructure, facial recognition and person detection are also being considered.

Oculi’s roadmap includes a product family with varying levels of on-chip processing capability. Future devices could add AI capabilities on-chip. Engineering samples on demo boards (single and dual/stereo SPUs) and a software development kit are available now.

This article was originally published on EE Times.

Sally Ward-Foxton covers AI technology and related issues for EETimes.com and all aspects of the European industry for EE Times Europe magazine. Sally has spent more than 15 years writing about the electronics industry from London, UK. She has written for Electronic Design, ECN, Electronic Specifier: Design, Components in Electronics, and many more. She holds a Master’s degree in Electrical and Electronic Engineering from the University of Cambridge.

