Computer-Vision Challenges in AVs

Article By : Giovanni Di Maria

Despite accelerating technology advances in autonomous driving, there are still some challenges to overcome to match (and surpass) the human ability to drive a vehicle.

Autonomous driving is not yet 100% reliable: several challenges remain before vehicles can match, and eventually surpass, human drivers.

What is computer vision?

Computer vision concerns the recognition of objects and the analysis of the outside world through specific cameras. Not so long ago, it was unthinkable that computer vision could be applied to the automotive sector. Now, it is supported by artificial intelligence, a powerful new pillar of information technology.

Computer vision serves many purposes: recognizing people, animals, and objects; detecting obstacles; reading road signs and traffic lights; determining the direction of travel of people and vehicles; and identifying and reading vehicle registration plates. It is an extremely critical application because human lives are at stake, so even the slightest miscalculation must be avoided.

Teaching a computer to see the world around us is a complex challenge. In the future, vehicles will not only learn to recognize the silhouettes of human beings or inanimate obstacles, but they will be able to recognize people’s faces, thanks to the high resolution of the sensors.

The processing of images collected by cameras in real time is the central element of computer vision, and in recent years, progress has been significant. Many companies have been developing chips dedicated exclusively to the acquisition and intelligent processing of images, which are the main input to the vehicle’s driving-decision system.

For example, Arm has developed Mali-C71AE, an image signal processor for multi-camera automotive vision systems. Applications include 360˚ surround view, object detection, lane positioning, road-sign recognition, mirror replacement, reverse camera, and occupant monitoring. Mali-C71AE supports vision systems that need to achieve ISO 26262 ASIL B diagnostic requirements in ADAS applications.

Ambarella has developed the CVflow chip architecture, which is based on a deep understanding of core computer-vision algorithms. The Santa Clara, California–based company claims that, unlike general-purpose CPUs and GPUs, CVflow includes a dedicated vision-processing engine programmed with a high-level algorithm description, which allows the architecture to scale performance to trillions of operations per second with extremely low power consumption.

Contextual awareness

Over the years, object-detection chips have become more powerful in terms of computing power, operating speed, and high-resolution image analysis. High resolution and sensitivity are two essential elements of automotive computer vision.

The first enables finer object identification, while the second allows detection in low-light conditions. The requirements for driving-grade machine vision are demanding. Chief among them is response time: the system must acquire and analyze images in a matter of milliseconds.
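The millisecond budget matters because the vehicle keeps moving while each frame is processed. A minimal sketch, with illustrative numbers not taken from the article, of the distance covered during one perception latency:

```python
# Distance a vehicle travels while the vision system is still "thinking".
# Illustrative values: speed in km/h, end-to-end latency in milliseconds.

def blind_distance_m(speed_kmh: float, latency_ms: float) -> float:
    """Meters traveled during one end-to-end perception latency."""
    speed_ms = speed_kmh / 3.6          # convert km/h to m/s
    return speed_ms * (latency_ms / 1000.0)

# At highway speed, even a modest latency costs meters of reaction distance:
print(round(blind_distance_m(130, 100), 2))  # 130 km/h, 100 ms -> 3.61 m
```

Halving the latency halves this "blind" distance, which is why dedicated vision silicon such as the chips described above is so valuable.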

Today, 3D vision is undeniably useful for computer vision. By analyzing 3D images, the system can extract reliable, precise information about the car's trajectory, any obstacles, and the movement of other vehicles (see the detection example in Figure 1). Sensors in use today include ultrasonic, laser, radar, and optical systems. In the future, cars will also learn to talk to each other through vehicle-to-vehicle systems that allow an intelligent exchange of information.
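As a rough illustration of trajectory analysis, the sketch below extrapolates another vehicle's position from two successive detections under a constant-velocity assumption. Real systems track over many frames with filters such as a Kalman filter; this shows only the core idea, and all values are invented:

```python
# Constant-velocity trajectory prediction from two successive 2D positions,
# a toy stand-in for the motion analysis a 3D vision system performs.

def predict_position(p_prev, p_curr, dt, horizon):
    """Linearly extrapolate position `horizon` seconds ahead of `p_curr`."""
    vx = (p_curr[0] - p_prev[0]) / dt   # estimated velocity, x component
    vy = (p_curr[1] - p_prev[1]) / dt   # estimated velocity, y component
    return (p_curr[0] + vx * horizon, p_curr[1] + vy * horizon)

# A car seen at (0, 0) and then at (1, 0.5) one frame (0.1 s) later:
print(predict_position((0.0, 0.0), (1.0, 0.5), 0.1, 1.0))  # (11.0, 5.5)
```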

The AI system, through sophisticated and fast algorithms, can identify everything that surrounds the car by fusing the detections of a network of sensors of different types.

After a thorough analysis, the AI system issues the appropriate commands to ensure safe driving. In other words, it calculates the cruising speed, prepares for a possible emergency braking, and makes sure that the car does not exceed the speed limits. Every situation must also be communicated to the driver by means of visual and audible alerts.
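The emergency-braking decision described above can be sketched with simple kinematics: compare the deceleration needed to stop within the measured gap (v²/2d) against comfort and emergency limits. The thresholds here are illustrative assumptions, not values from the article:

```python
# Toy emergency-braking check: given own speed and distance to an obstacle,
# decide whether normal braking suffices or an emergency stop is needed.

COMFORT_DECEL = 3.0   # m/s^2, assumed comfortable braking
MAX_DECEL = 8.0       # m/s^2, assumed emergency braking limit

def braking_action(speed_ms: float, gap_m: float) -> str:
    required = speed_ms ** 2 / (2 * gap_m)   # decel needed to stop in gap
    if required <= COMFORT_DECEL:
        return "brake"
    if required <= MAX_DECEL:
        return "emergency_brake"
    return "cannot_stop"  # braking alone is not enough: steer and alert

print(braking_action(20.0, 100.0))  # required 2.0 m/s^2 -> "brake"
print(braking_action(30.0, 60.0))   # required 7.5 m/s^2 -> "emergency_brake"
```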

Figure 1: The sensors present in the vehicles of the future will be more attentive than distracted drivers. (Image: Di Maria superimposed Unsplash photos to explain his view)

Identification of barely visible objects

The main obstacle to processing this information is obtaining sharp, very-high-resolution images. The human eye is probably the most complex video camera in existence: its ability to adapt automatically to different light and operating conditions, combined with high-quality optics, lets it send extremely detailed information to the brain.

Technology is making tremendous strides, but it will take a long time for digital video cameras to match and surpass the possibilities offered by human nature. Here are some key elements for image processing with the greatest possible accuracy:

  • Scanning and acquisition speed
  • Very high resolution of the camera
  • Acquisition sensitivity even in unfavorable lighting conditions

These features suit automated digital systems. To improve results, also in terms of safety, higher-resolution and longer-range sensors are being deployed.

These sensors offer very high resolutions of 2,000–3,000 image lines, roughly 10× the quality obtained by the traditional methods used until now. The information they collect is reliable and consistent with the real world while remaining robust to external interference.

Figure 2: Long-range radar sensors can detect other vehicles at a distance of several hundred meters and allow the system to intervene promptly in critical situations. (Source: Bosch)

Recent advances promise to go further. Researchers have tried a new approach to detecting objects on the road even when they are partially or totally hidden behind other objects. Using neural-network methods, the system can reconstruct the hidden parts of people and objects by analyzing only the visible parts.
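The inference being made can be illustrated without a neural network at all. The crude sketch below "completes" an occluded pedestrian detection by extending a partially visible bounding box to a typical aspect ratio; this is not the researchers' method, only an analog of the reasoning, and every value is an assumption:

```python
# Crude, non-neural analog of completing an occluded detection: given the
# visible top portion of a pedestrian's bounding box and a typical
# height/width ratio, extrapolate the full box downward.

TYPICAL_ASPECT = 2.5  # assumed pedestrian height/width ratio

def complete_box(x, y_top, width, visible_height):
    """Return (x, y_top, width, full_height), extending the box downward."""
    full_height = width * TYPICAL_ASPECT
    # Never shrink the box below what is actually visible.
    return (x, y_top, width, max(full_height, visible_height))

# 60 px of a pedestrian visible above a parked car, 40 px wide:
print(complete_box(10, 5, 40, 60))  # inferred full height: 100 px
```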

A research team at Princeton University is working on the application of Doppler radar to detect and track hidden objects, while the Gwangju Institute of Science and Technology in South Korea is developing a neural network that would allow the machine to manage occluded objects in its own space.
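The physics behind such Doppler-radar work reduces to one relation: a target's radial speed follows from the frequency shift it induces, v = f_d · c / (2 f0). A back-of-the-envelope sketch with illustrative values for an automotive 77-GHz radar:

```python
# Doppler relation used by automotive radar: radial speed from the
# frequency shift of the returned signal, v = f_d * c / (2 * f0).

C = 3.0e8  # speed of light, m/s

def radial_speed(doppler_shift_hz: float, carrier_hz: float) -> float:
    """Radial speed (m/s) of a target from its Doppler shift."""
    return doppler_shift_hz * C / (2 * carrier_hz)

# A 5.13-kHz shift at a 77-GHz carrier corresponds to roughly 10 m/s:
print(round(radial_speed(5.13e3, 77e9), 2))  # 9.99
```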

This could give rise to “almost human” sensors: it is the human brain, after all, that reconstructs the missing parts of an obstacle by observing only a few visible elements and drawing on hierarchical memory. In practice, this is genuine visual deduction, a prerequisite for fully autonomous driving.

AI in cars: challenges and solutions

Keeping an object in focus is one of the most difficult tasks in image processing. The system must process high-resolution images that are continuously moving, with distance and angle that vary greatly over time and optical conditions that change instantly and continuously.

To emulate human behavior with all the safety aspects that come with it, several issues need to be resolved. For example, the AI used for autonomous driving still needs substantial improvement; quantum computers may make this possible in the future.

Sophisticated, high-performance sensors with higher speed, resolution, and sensitivity will also be required so that the system obtains the highest-quality images; without optical and acoustic sensors of improved resolution and range, the other advances would be of little use.

The data and information collected are a key asset and must be used to populate huge databases. Safe vehicle driving for all can be achieved only by combining the different requirements synergistically. Furthermore, to get as close as possible to ideal autonomous driving, diversified 360˚ sensors (optical, acoustic, radar, and other types) are needed to implement a large number of “senses,” many more than a human being has.

Thanks to AI, the most demanding actions, such as facial recognition and the recognition of animals, plants, and objects, should improve. Genetic algorithms, combined with mathematical analysis of facial features and other elements, can provide reliable support to guarantee safety. Some studies are underway on identifying people by their walking style and average stride length, as shown in Figure 3.
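As a hypothetical sketch of the gait-based identification the studies explore, one could reduce a track of footfall positions to an average stride length and match it against enrolled profiles. Real gait recognition uses far richer features; the names and values below are invented for illustration:

```python
# Toy gait identification: match a pedestrian's average stride length
# against enrolled profiles. All names and values are hypothetical.

def avg_stride(footfalls):
    """Mean distance (m) between consecutive footfall positions."""
    steps = [b - a for a, b in zip(footfalls, footfalls[1:])]
    return sum(steps) / len(steps)

PROFILES = {"walker_a": 0.65, "walker_b": 0.80}  # enrolled average strides

def identify(footfalls):
    """Return the profile whose stride is closest to the observed one."""
    stride = avg_stride(footfalls)
    return min(PROFILES, key=lambda name: abs(PROFILES[name] - stride))

# Footfalls at 0.0, 0.78, 1.60, 2.41 m give strides near 0.80 m:
print(identify([0.0, 0.78, 1.60, 2.41]))  # "walker_b"
```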

There are other useful aspects concerning, for example, driving assistance, helping in dangerous situations, checking whether the driver tends to fall asleep, or adjusting the cabin settings according to the driver’s driving style. By implementing high-level predictive maintenance, AI can also detect possible failures of the engine or other fundamental parts of the vehicle at an early stage.
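Predictive maintenance at its simplest is drift detection on a sensor channel. The minimal sketch below flags an upward drift in engine temperature readings; it is a stand-in for the far more sophisticated models the article alludes to, and the window and threshold are invented:

```python
# Minimal predictive-maintenance sketch: flag a sensor channel whose
# recent readings drift above its earlier baseline by more than `limit`.

def drift_alert(readings, window=3, limit=5.0):
    """True if the mean of the last `window` readings exceeds the mean
    of the earlier readings by more than `limit`."""
    baseline = sum(readings[:-window]) / (len(readings) - window)
    recent = sum(readings[-window:]) / window
    return (recent - baseline) > limit

temps = [88, 89, 88, 90, 89, 96, 97, 98]  # engine temperature, degrees C
print(drift_alert(temps))  # recent mean ~97 vs baseline ~88.8 -> True
```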

Figure 3: Some sensors can recognize people simply by the pace at which they move. (Image: Di Maria superimposed Unsplash photos to explain his view)


It will take some time before the safety level of autonomous vehicles is close to 100%. Companies are working primarily to improve road safety as much as possible and to significantly increase the number of kilometers driven in autonomous mode and decrease those driven directly by humans.

Mimicking the behavior of the human driver is a challenge. In a few years, most vehicles will be connected to the network, but to talk about true autonomy, according to researchers, we will have to wait about 20 years. Even so, the revolution in the industry is underway, and vehicles managed by true AI will be in full service to humans (see the predictive graph in Figure 4).

Figure 4: The autonomous-driving market will increase exponentially in the coming years. (Source: Di Maria created this graph based on Grand View Research data)

Self-driving cars, or autonomous vehicles, are a key driver of innovation in the automotive sector and hold significant growth potential. Sensors will enable information acquisition, but algorithms and system-management methodologies will matter more for the most demanding tasks of data analysis, processing, and decision-making.

This article was originally published on EE Times Europe.
