In covering automotive, I write about perception. I also write about driving policy. But it wasn’t until recently that I finally grasped how driving policy connects to perception...
* * * * *
When I’m behind the wheel, I can sense — from the way he (or, God forbid, she) drives — that a certain fellow driver next to me, behind me, or out front, is a jerk. I just know the jerk is going to cut me off, and he usually does.
Human drivers make a lot of assumptions about other road users, about road conditions such as bad weather, and about what looks like an imminent traffic jam. They adjust their driving strategies accordingly. Such intuitions are critical to road safety.
But what if the driver is a robocar? How do we teach a machine to infer assumptions about other drivers and respond appropriately to its intuitions? Is “intuition” even teachable?
These questions nagged at me while I was writing an analysis last week about video clips posted by Mobileye. The unedited clips showed the company’s self-driving car deftly weaving through heavy traffic in Jerusalem.
Watching frame by frame, over and over, was my crude attempt to plumb the machine’s brain. I wanted to understand what a machine is seeing (or not seeing), how it’s interpreting its situation, and what action it plans to take. But as a non-AV developer, it was hard to empathize. The machine appears to speak its own language and make choices from deep inside its own train of cyber-thought, which I couldn’t fathom.
Welcome to the era of autonomy.
While watching Mobileye’s video, I stumbled onto several sequences in which the robocar’s maneuvers left me uncomfortable. I asked Mobileye and AV experts what might be going on behind the scenes.
Some answers were surprising and revealing. But mostly, they exposed something the media tends to overlook or minimize: the point where “perception” and “driving policy” meet. Robocars must indeed come with keen perception (we write about this all the time) and better machine learning capabilities (the hottest topic in media coverage). But we’re beginning to understand that driving policy could be critical to self-driving cars’ split-second decision-making.
We all want robocars to have 20/20 vision. Beyond clearly seeing the road ahead, we expect them to detect every object, accurately label it and, using the best available neural networks, take appropriate action.
What we have today, though, is a merely adequate perception system. Whether driven by a human or a machine, no real-world car has the good fortune to always drive in sunny weather, with its views never occluded by other vehicles, buildings or trees, and never confronted with such uncertainties as whether a pedestrian on the sidewalk will decide to cross against the light.
Perfect vision is important but that alone won’t make robocars safe drivers.
In his contributed piece in the forthcoming book, “Sensors in Automotive — Making Cars See and Think Ahead” (scheduled for launch on Oct. 19, published by Aspencore Media, which owns EE Times), Phil Koopman, CTO of Edge Case Research and associate professor at Carnegie Mellon University, brought up an example of a child who dashes out into the street to retrieve a ball just as the self-driving car is about to pass by.
“The tricky part of perception and planning is predicting what will happen next in a changing situation…
“Sensors need to provide information not only about object motion and position but also about likely changes in motion.”
Must think differently
Perception or vision systems are “probabilistic by their nature… they are known to have failures,” Jack Weast, Intel’s senior principal engineer and Mobileye’s vice president for autonomous vehicle standards, recently told us on our show, the EE Times on Air Weekly Briefing.
Given that there’s no such thing as a perfect sensor (i.e. “a 100-percent accurate sensing all the time for its lifetime of the car”), Weast stressed, “We’d have to think differently to solve this problem by delivering a sensing capability that is sufficient.”
Safety envelope needed
This is where “bumper bowling” comes in, according to Weast.
Remember the idea of Responsibility-Sensitive Safety (RSS) that Intel/Mobileye has been talking about since 2017?
After its launch, Intel contributed RSS to IEEE. The framework became the starting point of an IEEE standards discussion on safe automated vehicle decision-making. RSS is “a mathematical model that defines a safety envelope along with proper responses that the vehicle should take,” Weast explained. “It prevents robocars from getting into an accident by their own faults or from something caused by fellow drivers.”
In short, just as bumper bowling keeps inexperienced bowlers out of the gutter, RSS keeps a self-driving car from veering into accidents.
Assumptions & predictions
Autonomous vehicle advocates are busy claiming that AVs can save a ton of lives. But they don’t give credit where credit is due. Humans are, in principle, excellent drivers. They make intuitive assumptions, they use common sense, and they tend to respond appropriately to a potentially dangerous situation. Robocars, however, lack intuition.
RSS is an attempt to make those human assumptions and “implicit” traffic rules interpretable by machines, by defining what constitutes a dangerous situation, what causes it, and how to respond. A mathematical formula defines for the machine what a safe distance is and what it means to drive cautiously, Weast explained.
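To make that concrete, the published RSS work gives a closed-form minimum safe longitudinal distance between a rear car and the car in front of it. Here is a rough sketch in Python; the response time and acceleration bounds below are illustrative placeholders, not values Mobileye actually uses:

```python
def rss_min_safe_distance(v_rear, v_front, rho=1.0,
                          a_max_accel=2.0, a_min_brake=4.0, a_max_brake=8.0):
    """Minimum longitudinal gap (meters) the rear car must keep, per the
    published RSS formula. Speeds in m/s, accelerations in m/s^2.
    The default parameter values are illustrative assumptions."""
    # Worst case for the rear car: it accelerates for the full response
    # time rho, then brakes only gently (a_min_brake) until it stops.
    v_rear_after_rho = v_rear + rho * a_max_accel
    d_rear = (v_rear * rho
              + 0.5 * a_max_accel * rho ** 2
              + v_rear_after_rho ** 2 / (2 * a_min_brake))
    # Worst case for the front car: it brakes as hard as physically possible.
    d_front = v_front ** 2 / (2 * a_max_brake)
    # If the rear car would stop well short anyway, no extra gap is needed.
    return max(0.0, d_rear - d_front)
```

With both cars at 20 m/s (roughly 45 mph) and these placeholder parameters, the formula demands a gap of about 56.5 meters; the exact number depends entirely on the assumed reaction time and braking limits.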
What I didn’t know before I talked to Shai Shalev-Shwartz, CTO of Mobileye and senior fellow at Intel, was that RSS also offers a check on in-vehicle artificial intelligence algorithms that generate driving commands.
Just like perception, AI is also by nature probabilistic.
While watching the Mobileye video, I was somewhat alarmed by what experts call “flickering effects” in the visualization software. The self-driving vehicle appears to detect several parked cars, but then, seconds later, those cars, still parked in the same location, start disappearing. And the number of parked cars keeps changing.
When I asked about this, Mobileye reassured me that the AV software driver is tracking objects even when they don’t appear on the software visualization screen. This is because “the driving policy has a ‘common sense’ layer that includes logic like ‘things cannot vanish into thin air’,” Shalev-Shwartz told us.
That “common sense” layer is provided by RSS.
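As a thought experiment, here is roughly what a “things cannot vanish into thin air” persistence layer might look like. The class names and the five-frame budget are my own invention for illustration, not Mobileye’s code:

```python
class TrackedObject:
    def __init__(self, obj_id, position):
        self.obj_id = obj_id
        self.position = position
        self.frames_since_seen = 0

class PersistenceLayer:
    """Toy 'common sense' logic: a track survives a few frames of missed
    detections instead of flickering out of the world model.
    MAX_MISSES is a made-up illustrative value."""
    MAX_MISSES = 5

    def __init__(self):
        self.tracks = {}

    def update(self, detections):
        """detections: {obj_id: position} reported by perception this frame.
        Returns the set of object ids the layer still believes exist."""
        for obj_id, pos in detections.items():
            if obj_id in self.tracks:
                self.tracks[obj_id].position = pos
                self.tracks[obj_id].frames_since_seen = 0
            else:
                self.tracks[obj_id] = TrackedObject(obj_id, pos)
        # Age tracks the detector missed; drop one only after a
        # sustained absence, not a momentary flicker.
        for obj_id in list(self.tracks):
            if obj_id not in detections:
                self.tracks[obj_id].frames_since_seen += 1
                if self.tracks[obj_id].frames_since_seen > self.MAX_MISSES:
                    del self.tracks[obj_id]
        return set(self.tracks)
```

The point is the asymmetry: a new detection is believed immediately, but a disappearance has to persist before the planner stops accounting for the object.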
Shalev-Shwartz also added that an important component of RSS is to “know what you don’t know.” He told us:
This means that at any time, for every area in the 3D road view, we know that either: (1) it is known to be occupied by some road-user, (2) it is known to be empty road, or (3) it is unknown. RSS logic is configured to behave properly in each case. The “unknown” mechanism applies well to objects detected in 2D, but there’s a lot of uncertainty about positioning them in the 3D world.
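A toy rendering of that tri-state logic might look like the following; the names and the “creep” condition are invented here for illustration, not drawn from RSS itself:

```python
from enum import Enum

class CellState(Enum):
    OCCUPIED = "occupied"   # known to contain some road user
    FREE = "free"           # known to be empty road
    UNKNOWN = "unknown"     # perception cannot tell

def may_enter(cell_state, can_stop_within_visible_range):
    """Hypothetical decision rule in the spirit of the quote: treat
    UNKNOWN space conservatively rather than as free space. The second
    argument stands in for 'we are moving slowly enough to stop before
    reaching anything hidden there.'"""
    if cell_state is CellState.FREE:
        return True
    if cell_state is CellState.OCCUPIED:
        return False
    # UNKNOWN: enter only under a cautious speed constraint.
    return can_stop_within_visible_range
```

The key design choice is that “unknown” is a first-class answer with its own behavior, instead of being collapsed into “free” or “occupied.”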
Is RSS baked into the AV software stack?
Given that the driving policy has become the focus of the standard discussion, how will individual companies implement it?
Can RSS, for example, be baked into another company’s own AV software stack?
Intel/Mobileye has contributed its own RSS to IEEE P2846, which Weast is chairing. Other companies have also contributed their safety models, explained Weast. Because this is a technology-neutral standard, there is no requirement for anyone to use “a particular kind of chip or sensor,” he said.
For example, “It’s entirely possible that you could build your own safety model, which would be still conformant with the [IEEE] standard,” he added.
He stressed, “At this point, you know, we’re solving a problem for the industry.” The biggest fear for many AV developers, although rarely spelled out, is that lacking a positive industry contribution, “there may not be an automated vehicle market for us to sell into,” said Weast. If robocars don’t share common “assumptions about other road users,” it will be very hard for AVs to balance safety and utility, he noted.
Who else is on board with IEEE P2846?
If IEEE P2846 is indeed to become a meaningful industry standard, who else is in on it? Weast said, “I’m very pleased to have Waymo as my vice chair, Uber as our secretary, and we have over 25, I think at last count, companies across the OEM community, the tier one community.”
Participants in P2846 activities include some government representatives, research institutions and a mix of different entities. The group expects to complete its first draft either by year’s end or early next year.
How do you alter driving behavior for different geography?
As I wrote in my last analysis, I thought Mobileye’s unprotected left turn, shown in the company video, raised a red flag. Some EE Times readers concurred. Because the Mobileye vehicle inches out into the road and blocks traffic, forcing an oncoming motorcycle to stop, this robo-trick doesn’t look like safe driving.
Shai Shalev-Shwartz, however, was adamant in response, noting “It’s completely normal here in Israel as well as in most western countries. Waiting idly for the perfect situation is not useful.”
Blocking traffic might be cool in Jerusalem, but calling it normal in most Western countries is a big stretch. Rather, it exemplifies how tolerance for “aggressive driving” differs from city to city and country to country.
Imagine Mobileye supplying this AV software to China’s Geely to build “hands-free ADAS.” How does Mobileye edit the software to conform with Chinese driving habits? Does this mean AV developers must develop a different AV stack for every region?
The good news is that driving behavior is not baked into the AV software stack. Weast said, “Really it’s kind of the brilliant part about having these implicit driving rules embodied in the safety model, not necessarily in all of the rest of the automated driving stack.”
In other words, “RSS-based driving policy can be adjusted to match different driving styles (without compromising safety),” promised Shalev-Shwartz.
Common wisdom among AV developers today is this: if a vision system has difficulty guessing what it’s looking at or, worse, gets confused about what to do next, the best course is to add more sensors, such as radar and lidar. Fuse everything together, so that the AV has more confidence and its perception system creeps closer to reality.
Mobileye’s approach to sensor fusion is different. Mobileye’s Level 4 self-driving cars shown in the video clips use no radar or lidar, just twelve cameras.
As Amnon Shashua, president and CEO of Mobileye, announced at CES earlier this year, Mobileye’s Level 4 cars driving in Jerusalem are leveraging AI advancements and running different neural network algorithms on multiple independent computer vision engines. Multiple neural networks create “internal redundancies,” according to Shashua. He also discussed “VIDAR,” Mobileye’s solution for achieving outputs akin to lidar by using only camera sensors.
However, Mobileye is in fact working on its own radar and lidar. So, is this a case of wearing both a belt and suspenders?
Weast explained, “We have a separate vehicle that has radar and lidar only” running in Jerusalem. The goal is to improve the lidar/radar-only car to a perception level equal to the camera-only system.
He noted, “Now you combine those together, and you essentially have redundant but diverse sensing implementations that are operating in parallel. So, we can produce two world models and combine them, as opposed to only being dependent on one world model and depending on it alone for accuracy.”
In other words, two systems helping the robocar hold up its pants.
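For the curious, the “two world models” idea can be sketched as a conservative merge: a spot on the road counts as clear only if both sensing subsystems agree it is clear. This is my own hypothetical illustration of the principle, not Mobileye’s implementation:

```python
def fuse_world_models(camera_model, radar_lidar_model):
    """Hypothetical conservative merge of two independent world models,
    each mapping a grid cell to 'occupied', 'free', or 'unknown'.
    If either subsystem sees an obstacle, the fused model keeps it;
    a cell is free only when BOTH subsystems agree it is free."""
    fused = {}
    for cell in set(camera_model) | set(radar_lidar_model):
        a = camera_model.get(cell, "unknown")
        b = radar_lidar_model.get(cell, "unknown")
        if "occupied" in (a, b):
            fused[cell] = "occupied"       # trust whichever sensor saw it
        elif a == b == "free":
            fused[cell] = "free"           # both independently agree
        else:
            fused[cell] = "unknown"        # disagreement or missing data
    return fused
```

Under this kind of rule, a failure in one subsystem degrades the car toward caution ("unknown") rather than toward a falsely confident empty road, which is presumably the appeal of running two diverse sensing stacks in parallel.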