For many years, design verification has been one of the biggest challenges for companies involved in ASIC design. To cope with the growing complexity of ASIC designs, various hardware verification languages (HVLs) have emerged over the last couple of decades. Starting with Verilog as a basic verification language around 2000, HVLs such as OpenVera and SystemVerilog have gained popularity.

The semiconductor industry has widely adopted SystemVerilog (SV) as an effective and user-friendly HVL. EDA companies introduce something new every few years to speed up and add value to the verification cycle of an ASIC design. Like HVLs, various methodologies such as eRM, AVM, VMM, OVM, and UVM have become popular within a span of a few years. UVM is the most popular verification methodology that the industry has adopted for SV. Formal verification techniques are also gaining popularity as the related tools mature.

The question that emerges is why, in spite of having powerful HVLs, methodologies, and good verification practices in place, ASIC design companies still face verification escapes: bugs missed in design validation/verification for certain scenarios, which are caught only during the silicon validation phase or while the end product is in use at the customer's site. Such escapes lead to a respin of the ASIC, which is very expensive in terms of both product cost and time-to-market. Sometimes a company has to go through multiple respins, if new issues are caught after a respin and no hardware or software workaround is possible. This paper discusses some causes that lead to silicon bugs.

Problems

Scenario miss due to missing details in Design Requirements Specification

For the design team as well as the verification team, the reference document is the Design Requirements Specification (DRS). Different people give different names to this document, which describes the detailed functionality of the targeted design block(s); some call it the micro-architecture specification (MAS). It is mostly created by the designer and reviewed by the design lead, the chip architect, and other concerned authorities.

The source of this document is the design datasheet or a higher-level application document that describes the top-level features supported by the major design blocks. If some detail of a specific feature is missing from the DRS, is only partially documented, or was never properly clarified, the related scenario does not get captured in the Verification Plan (vPlan). If no one (verification engineer or designer) notices the gap during review, that functionality goes unverified, and any bug in it escapes to silicon.
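One way to make such gaps visible during review is a mechanical traceability check between DRS features and vPlan entries. The sketch below is illustrative only: the feature IDs and the dictionary layout are hypothetical, not a format that the DRS, the vPlan, or any tool prescribes.

```python
# Hypothetical traceability check: every DRS feature should map to at least
# one vPlan item. IDs and data layout are assumptions made for this sketch.

def untraced_features(drs_features, vplan_items):
    """Return the DRS feature IDs that no vPlan item claims to verify, sorted."""
    covered = set()
    for item in vplan_items:
        covered.update(item["features"])
    return sorted(set(drs_features) - covered)


drs_features = ["F1_reset", "F2_pkt_ctrl", "F3_low_power"]
vplan_items = [
    {"id": "VP-001", "features": ["F1_reset"]},
    {"id": "VP-002", "features": ["F2_pkt_ctrl", "F1_reset"]},
]

missing = untraced_features(drs_features, vplan_items)
print("DRS features with no vPlan entry:", missing)
```

A check like this only flags features that exist in the DRS but have no vPlan entry. Details that are missing from the DRS itself still have to be found by human review.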

Software/end-customer use case scenarios not captured in Verification Plan (vPlan)

The scenarios a software team uses for product validation are closer to end-customer use cases. The configuration combinations covered by the software product validation team may look simple from the perspective of a specific design block, but at a higher level, combined with the configurations of other blocks, they may produce cases that are not part of the design vPlan at all. Even though in-depth features for complex scenarios are captured in the vPlan for all major blocks during block-level design verification, such missed software/end-user scenarios sometimes uncover a hidden design bug in the silicon.

Verification escape due to issue in RM or assertion check model

Many design features depend on the Reference Model's (RM) predicted results, which are compared against the actual results from the design to ensure design correctness. Many other features depend on assertion checkers. These are typically checks that are hard for the RM to predict accurately, or checks that enforce timing relations across interface control signals.

For complex designs, there are requirements such as “the design should not drive specific values on specific interface pins” or “specific values never appear on some control signals in a specific order.” Assertions written for conditions that are impossible according to the design spec are never expected to fire, precisely because the design is not supposed to produce the violation. This means that if such an assertion is written incorrectly (i.e. its passing and failing conditions have not both been verified), a real design issue is hard to catch: the violation is not expected by design, so even if it occurs in simulation, the faulty assertion stays silent. Other data checkers may not flag an error for such a condition either.

For better understanding, let's take an example. Block B1 drives packet control information to block B2. It drives a 2-bit signal indicating SOP (a cycle with value 2'b01), MOP (from the second indication onwards, one or more single cycles with value 2'b01), and EOP (2'b11), where SOP = Start of Packet, MOP = Middle of Packet, and EOP = End of Packet. The design spec for B1 says that, for each packet, the last MOP and the EOP indications must be in back-to-back cycles. This means that for a 4-buffer packet it drives 2'b01 (SOP) => {N {2'b00}} (IDLE) => 2'b01 (MOP) => {N {2'b00}} (IDLE) => 2'b01 (MOP) => 2'b11 (EOP), which are legal transitions. All transitions except the last two are separated by N cycles of 2'b00 IDLE data. As per the spec, 2'b11 (EOP) must have MOP (2'b01) in its immediately preceding cycle.

If block B1 violates this condition while driving this information to block B2, B2 may not behave properly for certain B2 configurations. Now suppose that in 99%+ of cases B1 drives this control signal as per the spec, but in some specific corner case(s) it violates the rule. If the assertion has a coding issue (having been verified only against the 99%+ of passing cases), it does not fire at all for conditions that violate the design spec. Moreover, B2 misbehaves only when a scenario with a specific B2 configuration coincides with the <1% of conditions in which B1 violates the rule.

In other words, for very specific cases (which are hard to anticipate given the design complexity), the violation shows up only as other failures related to block B2. This is how an assertion coding issue leads to a bug in silicon.
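The B1/B2 rule can be modelled in a few lines to show how a miscoded check stays silent. This is a minimal sketch in Python rather than in an assertion language. The encoding is taken from the example above, but the function names, the value of N, and the particular coding mistake (comparing against the last non-idle value instead of the immediately preceding cycle) are assumptions made purely for illustration.

```python
IDLE, SOP_MOP, EOP = 0b00, 0b01, 0b11  # 2-bit control encoding from the example


def eop_violations(trace):
    """Cycle indices where EOP is not immediately preceded by a 2'b01 cycle."""
    return [
        i for i, value in enumerate(trace)
        if value == EOP and (i == 0 or trace[i - 1] != SOP_MOP)
    ]


def eop_violations_buggy(trace):
    """Hypothetical coding mistake: compares against the last non-idle value
    instead of the immediately preceding cycle, so idle gaps are ignored."""
    violations = []
    last_non_idle = None
    for i, value in enumerate(trace):
        if value == EOP and last_non_idle != SOP_MOP:
            violations.append(i)
        if value != IDLE:
            last_non_idle = value
    return violations


N = 2  # idle cycles between indications; arbitrary choice for this sketch
legal = [SOP_MOP] + [IDLE] * N + [SOP_MOP] + [IDLE] * N + [SOP_MOP, EOP]
# Same packet, but with one idle cycle between the last MOP and EOP (illegal).
illegal = [SOP_MOP] + [IDLE] * N + [SOP_MOP] + [IDLE] * N + [SOP_MOP, IDLE, EOP]

for name, trace in (("legal", legal), ("illegal", illegal)):
    print(name, eop_violations(trace), eop_violations_buggy(trace))
```

On the legal trace both checkers report nothing, so passing traffic cannot tell them apart. On the illegal trace only the first checker flags the EOP cycle (index 8); the miscoded one returns an empty list, which is exactly the silent failure described above.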

Similarly, if the RM prediction logic is coded so that it matches the design behaviour rather than what the design spec says (for example, due to misinterpretation of the spec for some complex scenario), it can lead to a bug in silicon.

Corner case scenario misses which are hard to discover in advance

The scenarios captured in the verification plan (vPlan) are the ones the verification team must focus on; the vPlan is the team's reference document. During the verification phase, the vPlan is updated and reviewed multiple times. Initially, while reading and understanding the design specs, verification engineers start capturing feature details in the vPlan. Design understanding is not yet mature in this initial phase, so it is not possible to think of and add details for all possible complex scenarios.

During VE development, and once actual verification starts for certain features, the vPlan is revisited and updated with more accurate details. During vPlan feature closure, it is updated with the actual mapping of coverage items, test cases, and assertions to specific features. During final closure, when knowledge of all supported functionality is at its peak, the vPlan is updated with the more complex scenarios that verification engineers can think of. The vPlan is also reviewed multiple times by the design team, who give feedback on scenarios they want covered based on their knowledge of how things are actually implemented.

If, after all such reviews and revisions of the vPlan, the team (verification + design) still fails to capture some really interesting scenario, that scenario becomes a verification escape and a possible bug in silicon. Even if 99.99% of complex scenarios are addressed during the verification phase, a design bug can hide in the remaining 0.01% that were missed in the vPlan. That 0.01% therefore plays a very important role.

False functional coverage numbers

Functional coverage plays a very important role in measuring progress during design verification. Teams can reach 90% coverage on any design, but the remaining 10% consumes a good amount of time during the coverage closure phase. Implementing functional coverage in the right manner is very important. It is not only “what to cover” that matters but also “when to cover” and “how to cover.” Triggering coverage sampling on the right condition is critical. If not enough attention is given to this, there is a real chance of getting false coverage numbers. It is human tendency to examine more closely the things that show failures or coverage holes, while items already reported as covered (even with improper triggering conditions) receive less attention. Such false coverage numbers can lead to verification escapes for important features that were planned in the vPlan but never really covered.
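The effect of the sampling trigger can be seen in a toy model. The sketch below is not a covergroup implementation; the signal names (`valid`, `mode`) and the four bins are hypothetical, chosen only to contrast sampling on every cycle with sampling on a qualifying condition.

```python
# Toy model of a set of coverage bins over a 2-bit `mode` field. The signal
# names (`valid`, `mode`) are hypothetical, chosen only for this sketch.

def mode_coverage(cycles, require_valid):
    """Return the set of `mode` values sampled from a list of cycle dicts."""
    hit = set()
    for cycle in cycles:
        if require_valid and not cycle["valid"]:
            continue  # bus content is not meaningful in this cycle
        hit.add(cycle["mode"])
    return hit


cycles = [
    {"valid": 1, "mode": 0},
    {"valid": 0, "mode": 3},  # don't-care value while valid is low
    {"valid": 1, "mode": 1},
    {"valid": 0, "mode": 2},  # another don't-care value
]

all_bins = {0, 1, 2, 3}
sampled_every_cycle = mode_coverage(cycles, require_valid=False)
sampled_on_valid = mode_coverage(cycles, require_valid=True)

print("every cycle:", 100 * len(sampled_every_cycle) // len(all_bins), "%")
print("valid only :", 100 * len(sampled_on_valid) // len(all_bins), "%")
```

Sampling on every cycle reports all four bins as hit, although modes 2 and 3 were never driven with `valid` high. Sampling only on `valid` reports two of four bins, which exposes the real hole.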

Product backward compatibility issues

For any product company, backward compatibility with previous versions of a product is as important as adding features in the next version. Existing customers should be able to keep using the features of older product versions as they are, and should be able to move to the newer version without losing compatibility with the features they have been using.

With this in mind, support for old features has to be considered while defining the advanced features of each new product version. A future ASIC targeted at the new product should support all those legacy features (assuming they are needed). Design behaviour should not differ when the same scenario is run on the older and the newer design. Sometimes the step to the next version is so large, with so many new features, that incompatibilities with old behaviour can creep in if the design changes are not evaluated against legacy behaviour. Such an incompatibility can lead to a bug in silicon.
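One possible guard is to replay legacy scenarios on both design generations and compare the results. The sketch below assumes that behavioural models of both revisions exist; `legacy_model`, `next_gen_model`, the `burst` field, and the arithmetic inside them are invented stand-ins, not anything taken from a real design.

```python
# Hypothetical behavioural stand-ins for two design revisions. In the newer
# one, an added `burst` feature has a default that alters legacy output.

def legacy_model(cfg):
    return cfg["length"] * 2


def next_gen_model(cfg):
    burst = cfg.get("burst", 4)  # new feature with a non-neutral default
    return cfg["length"] * 2 + (burst if cfg["length"] > 8 else 0)


def compat_mismatches(scenarios, old, new):
    """Return (scenario, old_result, new_result) for every differing result."""
    return [(s, old(s), new(s)) for s in scenarios if old(s) != new(s)]


legacy_scenarios = [{"length": 4}, {"length": 8}, {"length": 16}]
for scenario, old_out, new_out in compat_mismatches(
        legacy_scenarios, legacy_model, next_gen_model):
    print("legacy behaviour changed:", scenario, old_out, "->", new_out)
```

Here only the `length` 16 scenario differs (32 on the old model, 36 on the new one), which is the kind of unintended change that a legacy regression is meant to surface before tape-out.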

Solutions

There can be multiple reasons for verification escapes of important design features that result in silicon bugs. Design verification has to be done with care at each stage of the verification phase to avoid problems later on, when they are much more expensive in every respect. For the possible causes described above, here are steps to reduce the likelihood of verification escapes.

  1. The DRS should capture clear, high-quality details for all targeted design features. Verification engineers should be allowed to file spec bugs against the DRS for insufficient or unclear details.
  2. The vPlan should cover end-use-case scenarios wherever those details can be obtained. The software product validation team, if available, should give input on any missing high-level scenarios and, if time permits, review the vPlan prepared by the verification team.
  3. Assertion checks have to be verified before they can be trusted to catch design issues. It is important to break each such assertion, either by simulating an illegal scenario (if possible) or artificially by force/release of the contributing signals. The assertion should fail whenever simulation runs a scenario that violates the condition it was written for.
  4. The RM should also be verified or spot-checked first to ensure that its predicted results match what the design spec says. Incremental manual checking of debug displays for the different functional paths helps here. Directed tests (perhaps for basic scenarios) with self-checks also help, since RM and RTL results can be compared against those self-checks.
  5. Design and verification teams should jointly ensure that the vPlan is solid with respect to all design features mentioned in the DRS. All legal and illegal scenarios, including corner cases, have to be discussed and the features that need verification captured in the vPlan. Multiple reviews at different points during the design/verification cycle help a great deal in adding and updating scenarios.
  6. The functional coverage implementation has to be reviewed by peers. Along with what it is intended to cover, the review should focus on the right triggering event for sampling coverage. If functional coverage is implemented using design signals/variables, the designer has to review them to ensure they are the correct design signals for the targeted bins.
  7. Important features of the previous revision of the design have to be considered when the design is updated to support additional features in the next revision. The chip architect and designer should work on this jointly.
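Step 4 can be sketched as a directed test whose expected values are worked out by hand from the spec, independently of both the RM and the RTL. Everything specific in the sketch is an assumption for illustration: the spec rule ("an 8-bit byte counter saturates at 255"), the function names, and the deliberately wrong RM that wraps around instead of saturating.

```python
# Hypothetical spec rule for this sketch: "the 8-bit byte counter saturates
# at 255". The RM below wraps instead, mirroring an (assumed) RTL behaviour.

def rm_byte_count(lengths):
    """Reference-model prediction under test (deliberately wrong: wraps)."""
    return sum(lengths) % 256


# Directed cases with expected values derived by hand from the spec text,
# not from the RM or the RTL.
DIRECTED_CASES = [
    ([10, 20], 30),
    ([200, 55], 255),
    ([200, 100], 255),  # saturation corner: spec says 255, a wrapping RM gives 44
]


def rm_spec_mismatches(rm, cases):
    """Return (stimulus, spec_expected, rm_predicted) for each disagreement."""
    return [(stim, exp, rm(stim)) for stim, exp in cases if rm(stim) != exp]


for stim, exp, got in rm_spec_mismatches(rm_byte_count, DIRECTED_CASES):
    print(f"RM disagrees with spec for {stim}: spec={exp}, RM={got}")
```

Because the expectations come from the spec rather than from the design, an RM that merely mirrors the RTL is exposed on the saturation case even though RM and RTL would agree with each other in simulation.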

If the points described here are kept in mind, the risk of design verification escapes is greatly reduced, potentially yielding major cost and time savings.

This article first appeared on EDN.