Foretellix’s co-founders believe car OEMs cannot verify the safety of an AV through miles driven alone. They focus instead on the quality of coverage.
Complex systems — whether a system-on-chip or autonomous vehicles — can frustrate design engineers who, after months of painstaking work, have to go back and verify that the system they just designed actually performs the way they intended.
SoCs and autonomous vehicles (AVs) are both built in a “black box,” which by nature makes it hard to find bugs “hiding in places that you don’t think about,” said Ziv Binyamini, CEO and co-founder of a Tel Aviv-based startup called Foretellix.
In testing and verifying an SoC, two measures are deemed essential: “code coverage,” which tells how well the code is exercised by the stimulus, and “functional coverage,” a way for the user to write instrumentation logic that monitors how well the stimulus is covering various functions.
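The functional-coverage idea can be sketched in a few lines. The following Python snippet is purely illustrative (the bin names and transaction fields are invented, not from any real verification tool): the user declares coverage “bins” of interest, random stimulus is generated, and a monitor records which bins have been hit.

```python
import random

# Hypothetical functional-coverage sketch: each "bin" is a predicate over a
# stimulus transaction; the monitor counts how often random stimulus hits it.
COVERAGE_BINS = {
    "opcode_add":  lambda tx: tx["opcode"] == "ADD",
    "opcode_div":  lambda tx: tx["opcode"] == "DIV",
    "max_operand": lambda tx: tx["a"] == 255 or tx["b"] == 255,
}

def run_random_stimulus(n_transactions, seed=0):
    rng = random.Random(seed)
    hits = {name: 0 for name in COVERAGE_BINS}
    for _ in range(n_transactions):
        # Generate one random transaction (the "stimulus").
        tx = {"opcode": rng.choice(["ADD", "SUB", "MUL", "DIV"]),
              "a": rng.randrange(256), "b": rng.randrange(256)}
        # The coverage monitor observes the stimulus passively.
        for name, matches in COVERAGE_BINS.items():
            if matches(tx):
                hits[name] += 1
    return hits

hits = run_random_stimulus(1000)
coverage = sum(1 for h in hits.values() if h > 0) / len(hits)
print(f"functional coverage: {coverage:.0%}")
```

Bins left at zero tell the engineer which functionality the random stimulus has not yet reached, which is exactly the feedback that coverage-driven verification loops on.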
Foretellix believes that similar coverage-driven disciplines should apply to AVs when car OEMs test safety.
Today, vehicles from tech companies and OEMs are racking up millions of test miles in simulation, on test tracks, and on public roads. Last month, for example, Waymo announced that the company has driven more than 10 million street miles and some 10 billion simulated miles.
But here’s the rub:
Does anyone know what, exactly, companies like Waymo, Uber, Cruise and Argo AI are testing? How do they measure test results? What testing scenarios have their AVs experienced?
As Foretellix’s Binyamini sees it, today’s mileage-driven race among AV companies seeking to prove the safety of their products lacks “a quantifiable way to measure how much of the scenarios required to prove the safety of an autonomous vehicle have been exercised (covered).”
Moreover, they lack tools that could “provide a rigorous & automated way to uncover unknown risk scenarios and turn them into known,” he noted.
This is where Foretellix sees its opportunity. The company, built around a team of verification experts who grew up in the EDA industry, is migrating that expertise to the AV world.
For example, just as the EDA industry years ago developed SystemVerilog, a high-level hardware description and verification language for SoC designers, Binyamini told EE Times that Foretellix is developing the Measurable Scenario Description Language (M-SDL) for AV system designers.
M-SDL is currently being “tried out” by a few car OEMs in the United States and Europe, according to Foretellix. After gathering feedback, the current plan is to release it after the summer, said Binyamini. He also stressed that M-SDL is not proprietary. “This will be made open on GitHub.”
Foretellix is promising that M-SDL will offer “unified metrics” of test results, whether gathered in simulation, on test courses or on the road. “We are also injecting random testing to see what scenarios still need to be tested.”
Coverage-driven verification (Source: Foretellix)
Nexus of EDA and Automotive worlds
Mike Demler, a senior analyst at The Linley Group, cautioned that Foretellix is not building a verification tool for AV system design. Rather, it is proposing “a coverage analysis tool and coverage-driven verification” for AVs, he noted.
While acknowledging that the very idea of “coverage-driven verification” comes from EDA, Demler stressed that “coverage is a tool to check your verification plan, but it’s not a verification tool itself. A coverage tool checks that your test benches cover all the possible faults, or a sufficient number to satisfy a particular signoff criterion.”
So, in Demler’s view, Foretellix’s comparison of M-SDL to SystemVerilog is “a big stretch.” This looks more like “a test plan checker,” he said.
Nonetheless, the founders’ background strongly suggests that Foretellix is now trying to bring to the automotive industry a technology with strong roots in the semiconductor industry.
For anyone who has lived through the era of growing complexity in chip designs, the designs emerging in autonomous vehicles are almost familiar. Binyamini observed, “These are problems the chip industry already experienced in 1990s.”
When Intel was developing the Pentium Pro, Binyamini was a design automation engineer on the P6 project. Because the P6 was the first superpipelined, out-of-order, speculative-execution x86 machine, the processor was “extremely complex. It required new verification solutions to deal with that complexity.”
Before the P6 launch, Intel faced the “Pentium bug” crisis, a defect in the floating-point unit of early Pentium processors. The bug, discovered by a professor at Lynchburg College in 1994, was reported by EE Times. By December 1994, Intel had recalled the defective processors, at a cost of almost half a billion dollars. The incident made the electronics industry aware of the near impossibility of finding all of the bugs and problems inside a complex processor.
In 1997, Binyamini joined Verisity, a startup founded in 1995 by Yoav Hollander, a leading expert in VLSI verification. Verisity was billed as one of the world’s first verification companies, delivering a tool suite for VLSI verification based on a coverage-driven methodology.
Verisity told the semiconductor industry that coverage-driven verification “is the only way to deal with the complexity of chip designs.” At Verisity, Hollander created the “e” verification language, which later became a standard (IEEE 1647).
In 2005, Verisity was acquired by Cadence, where both Binyamini and Hollander worked through the following decade and led its verification business.
In 2015, the two executives left Cadence and co-founded Foretellix, seeking to solve the AV industry’s dilemma.
On one hand, design errors in an SoC can trigger a costly chip re-spin. Chip engineers typically spend as much effort verifying a chip as designing it, and that verification work can make or break a project.
On the other, any flaws in AV design could result in loss of a human life.
And yet, in Foretellix’s view, the AV industry is still stuck in the old race for “quantity of miles” instead of the “quality of coverage” required for safety testing and verification.
Systems in highly automated vehicles are complex enough on their own. But when a few environmental and behavioral factors (bad weather, road conditions, other vehicles cutting in and out) are added to test scenarios, the test regimen grows increasingly unwieldy. Despite such challenges, Foretellix claims that its tool, Foretify, can deliver “measurable safety” through the use of M-SDL and automated ways to generate “combinations of various scenarios.” The tool can also “randomly” create combined scenarios and monitor, check and track scenario coverage.
Foretellix tool deals with many scenario variants. (Source: Foretellix)
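The combinatorial explosion described above is easy to demonstrate. The sketch below is not Foretellix’s actual tool; it assumes a made-up “cut-in” scenario with three invented parameters, randomly samples test runs across the variant space, and reports which combinations remain untested.

```python
import itertools
import random

# Hypothetical parameter space for a "cut-in" scenario (invented values).
WEATHER = ["clear", "rain", "fog"]
ROAD = ["dry", "wet", "icy"]
CUT_IN_GAP_M = [5, 10, 20]  # gap to the cutting-in vehicle, in meters

# Every combination of parameters is one scenario variant to be covered.
ALL_VARIANTS = set(itertools.product(WEATHER, ROAD, CUT_IN_GAP_M))

def run_random_tests(n_runs, seed=0):
    rng = random.Random(seed)
    covered = set()
    for _ in range(n_runs):
        variant = (rng.choice(WEATHER), rng.choice(ROAD),
                   rng.choice(CUT_IN_GAP_M))
        covered.add(variant)  # in reality: execute in simulation, log results
    return covered

covered = run_random_tests(40)
missing = ALL_VARIANTS - covered
print(f"covered {len(covered)}/{len(ALL_VARIANTS)} variants; "
      f"{len(missing)} still untested")
```

Even this toy space has 27 variants from just three parameters; add more actors, speeds and road geometries and random testing alone quickly leaves gaps, which is why the coverage bookkeeping matters.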
Where Foretellix’ tools will be useful
Asked about Foretellix, Phil Magney, founder & principal advisor of VSI Labs, said, “What I like about it is it [appears to] apply [to] all test platforms whether you are doing full-on simulation, x-in-loop testing, test tracks, or on public roads. The Foretify solution, as they call it, manages all the testing formats and gives you the analytics and the metrics to know when you have full coverage.”
In today’s AV test environment, the available metrics are extremely limited. Often, the only accessible measures are disengagements and the number of miles AVs have travelled thus far.
By law, people actively testing self-driving cars on California roads must disclose the number of miles driven and how often human drivers had to take control, a moment of crisis known as a “disengagement.”
Many experts today don’t believe disengagement is the right metric.
Phil Koopman, CTO of Edge Case Research, told us that disengagement counts tend to incentivize test operators to minimize interventions, which encourages unsafe testing.
Magney agreed. Noting that disengagements “have to be taken with a grain of salt,” he said, “In the development phase you are still learning and exposing the vehicle to a variety of conditions. Quite often you are isolating certain technologies or approaches to see where they encounter difficulties. In other words, you are not running your full stack because that would make it harder to pinpoint the performance of a subsystem.”
Magney added, “The other element that’s a bit common sense but worth pointing out is that developing to 95 percent is only part of the work. The remaining percentiles are an order of magnitude harder and a much more dangerous place.” In his view, that five percent holds a huge, diverse and very lightly tested behavior space. Magney believes the Foretellix solution addresses that critical five percent.
Similar to Demler, Magney also said that Foretellix’s tool is more about verification [of testing coverage] than “the actual development of an AV.”
Scenario description languages
But what about a scenario description language? Is M-SDL the only game in town? Binyamini acknowledged that some companies are working on scenario description languages within standardization groups, while others might be developing internal tools. But the industry broadly shares the belief that AV developers will benefit from a common way to share, review and reuse scenarios built by other companies.
Binyamini said Foretellix is proposing M-SDL within the ASAM (Association for Standardization of Automation and Measuring Systems), a non-profit organization that promotes standards for tool chains in automotive development and testing.
An SDL is critical, especially for test engineers seeking an objective understanding of test scenarios that closely represent the real world.
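To make the sharing idea concrete, here is a hypothetical parameterized scenario record, serialized for exchange between companies. This is not M-SDL syntax; the fields and values are invented purely to illustrate what a shareable, machine-readable scenario description might contain.

```python
import json
from dataclasses import dataclass, asdict, field

# Hypothetical scenario record -- not M-SDL, just an illustration of a
# parameterized scenario that two organizations could exchange and reuse.
@dataclass
class CutInScenario:
    ego_speed_kph: tuple = (40, 120)   # range, to be sampled by the test tool
    cut_in_gap_m: tuple = (5, 30)      # range of gaps to the cutting-in car
    weather: list = field(default_factory=lambda: ["clear", "rain", "fog"])
    expected: str = "ego decelerates without emergency braking"

# Serialize the scenario so another company (or a regulator) can load it.
record = asdict(CutInScenario())
serialized = json.dumps(record, indent=2)
print(serialized)
```

The key property is that the scenario is parameterized (ranges, not fixed values) and carries an expected outcome, so a receiving team can regenerate variants in its own simulator rather than replaying a single recorded drive.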
Binyamini also noted that a unified SDL will help build transparency, enabling regulators to see what AV testing has been done. Regulators, for example, could compile a repository of test scenarios, ensuring that OEMs share the same understanding. Because M-SDL is written at a higher level, it is readable by regulators and the public, he added, it could ultimately help achieve public trust.
Magney also pointed out that Foretellix is not alone in thinking about ways to verify automated driving. Pegasus, for example, a German consortium, is working to establish generally accepted quality criteria, tools and methods, in addition to scenarios and situations, for the release of highly automated driving functions.
Magney noted, “However, Pegasus is targeting automated driving (L2-L3) rather than fully automated driving (L4+). Pegasus’ method uses expected distributions, which are good for estimating the frequency of expected failures.”
“But as Foretellix points out in their blog, you want unexpected distributions that ensure unexpected failures occur more frequently than they otherwise would.”
If all goes as planned, Magney expects Foretellix to set the stage for AV verification through a continuously updated library of parameterized scenarios. However, he cautioned, “It will take time to build up the scenarios. After all, in the interest of ISO 21448, SOTIF (the Safety Of The Intended Functionality), you don’t know all those situations (or scenarios) where you are going to be exposed; you will learn of them in time.”