This is the first in a series of blogs written for PuneChips by Suhas Belgal titled Field Manual for Verification Planning. The blogs deal with functional verification of digital ICs and cover mostly the pre-silicon verification phase.
The objective of this series is to provide a view of the ‘art’ of design verification. Everyone has heard the quote “Verification is an NP complete problem: it can never be done”. If so, how should one schedule the verification program? When is the chip really clear, from a verification standpoint, for tape-out or production? If it is art, how does one measure the quality? Or, how does one turn it into ‘science’ and bring predictability into the equation? Recently, while I was reading Robert Pirsig’s famous book “Zen and the Art of Motorcycle Maintenance”, the questions of ‘art’, ‘science’ and ‘quality’ of the chip being designed crossed my mind.
This series will provide practical insight into the various aspects of verification: tools, methodologies, best known practices, key indicators, tracking and management, while reflecting on the fundamental question of whether verification can ever be completed. All of these will help bring much needed predictability into the verification program. However, all the best known practices and automation tools in the world still cannot replace engineering intelligence. You still need the best and the brightest minds to tackle the challenges. On the flip side, this is what makes verification challenging and interesting, and what will attract the best minds out there. So, sprinkled throughout this series, you will find the term ‘intellectual process’ or ‘intellectual exercise’; it identifies the part of the process that still needs human intelligence or engineering discretion.
The basic goal of any chip design verification project is to find ‘all’ the bugs before tape-out! One way to bring pragmatism is to clearly identify the context in which the design is claimed to be bug free. That is, the design should be bug free for a crisply identified goal such as ‘customer demo/samples’ or ‘to enable software development’.
Even with such constraints, it remains an NP complete problem. This is where statistics and probability come in, and various indicators can be used to define the verification quality.
In addition to the fundamental goal, there are other objectives such as productivity, efficiency, resource usage, ‘finding critical bugs earlier’ and so on. These are just as important and, as a matter of fact, often even more important than the basic question of ‘have we caught all the bugs’.
Basic Dimensions
The three basic dimensions of verification are ‘Coverage’, ‘Stimulus’ and ‘Checker’. Regardless of the tools or methodology, any verification environment consists of these three parts.
Coverage addresses the fundamental question of how complete the verification is. Because this is an NP complete problem, verification can theoretically never be complete. In practice, completeness is measured as the ratio ‘covered vs planned’, and setting the ‘denominator’ of that ratio becomes an intellectual exercise. Certain practices and pitfalls will be covered in detail in the ‘coverage/testplan’ topic.
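To make the ‘covered vs planned’ ratio concrete, here is a minimal Python sketch. The item names, the `CoverageItem` class and the choice to drop waived items from the denominator are assumptions for illustration, not part of any particular tool.

```python
from dataclasses import dataclass


@dataclass
class CoverageItem:
    name: str
    hit: bool = False      # observed at least once in simulation
    waived: bool = False   # consciously excluded, with justification


def coverage_ratio(items):
    """Return covered / planned, where waived items leave the denominator."""
    planned = [it for it in items if not it.waived]
    if not planned:
        return 0.0
    covered = sum(1 for it in planned if it.hit)
    return covered / len(planned)


# Hypothetical plan for a FIFO block.
plan = [
    CoverageItem("fifo_full", hit=True),
    CoverageItem("fifo_empty", hit=True),
    CoverageItem("simultaneous_read_write", hit=False),
    CoverageItem("reset_during_burst", waived=True),
]
ratio = coverage_ratio(plan)
print(f"covered {ratio:.0%} of planned items")
```

Note how the reported number depends entirely on what was put in the plan: removing or waiving an item raises the ratio without any extra verification being done.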
Stimulus can be deterministic, but the stimulus space can explode very quickly. Say, a 2-input functional block can be covered exhaustively in 4 cases. But a 256-input block will require 2^256, or about 1.1579e+77, combinations: impractical! Even worse, sequential elements add a time dimension. Finding proper methods to select or prioritize ‘important’ or ‘high leverage’ stimulus is an intellectual exercise.
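The arithmetic behind this explosion is easy to reproduce. The sketch below assumes a purely illustrative throughput of one billion vectors per second to show how quickly exhaustive stimulus becomes infeasible.

```python
def exhaustive_vectors(num_inputs: int) -> int:
    """Number of input combinations for a block with num_inputs binary inputs."""
    return 2 ** num_inputs


SECONDS_PER_YEAR = 365 * 24 * 3600
VECTORS_PER_SECOND = 1e9  # assumed simulation throughput, purely illustrative

for n in (2, 32, 64, 256):
    total = exhaustive_vectors(n)
    years = total / VECTORS_PER_SECOND / SECONDS_PER_YEAR
    print(f"{n:>3} inputs: {float(total):.4e} vectors, ~{years:.1e} years")
```

This counts only combinational input patterns; any state held in sequential elements multiplies the space further.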
A checker, or rather the lack of a thorough checker, will make the other two dimensions useless. One can have a ‘complete’ checker only if the definition of the expected behavior is complete. Usually, this is covered by the Design or the Architecture Specification of the chip or the product. Properly interpreting the specification and checking it for completeness is an intellectual process.
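As a rough illustration of why the checker matters, the following Python sketch compares a stand-in ‘design’ against a reference on every transaction. The 8-bit adder, the planted bug and all function names are hypothetical; a real checker would be driven by the specification.

```python
import random

WIDTH = 8
MASK = (1 << WIDTH) - 1


def reference_adder(a: int, b: int) -> int:
    """Expected behavior as read from a (hypothetical) spec: wrap-around add."""
    return (a + b) & MASK


def dut_adder(a: int, b: int) -> int:
    """Stand-in for the design, with a deliberately planted corner-case bug."""
    if a == 0xFF and b == 0x01:
        return 0x01  # bug: should wrap to 0x00
    return (a + b) & MASK


def run_with_checker(stimulus):
    """Compare DUT and reference on every transaction; collect mismatches."""
    mismatches = []
    for a, b in stimulus:
        expected, actual = reference_adder(a, b), dut_adder(a, b)
        if expected != actual:
            mismatches.append((a, b, expected, actual))
    return mismatches


rng = random.Random(1)
stimulus = [(rng.randrange(256), rng.randrange(256)) for _ in range(1000)]
stimulus.append((0xFF, 0x01))  # directed corner case
failures = run_with_checker(stimulus)
print(f"{len(failures)} mismatch(es): {failures}")
```

Without the comparison in `run_with_checker`, the same stimulus would exercise the buggy case and still report nothing.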
Flow/Process
A typical flow involves the following steps/phases, interspersed with reviews (every phase should begin and end with reviews).
- Design/Specification study
- Coverage planning (traditionally known as test-plan development)
- Testbench design planning
- Setting up the Verification Environment – databases, bug tracker, templates, regression/simulation environment, debug process, indicator tracking
- Testbench implementation
- Coverage plan implementation
- Bringup (fresh RTL tested against fresh testbench)
- Feature coverage
- Coverage exploration
- Final Checklist
Verification Methods
Several methods can be employed to carry out pre-silicon verification. The primary ones are simulation, emulation and formal methods. All have pros and cons and can be used to complement each other.
The simulation method uses dynamic simulation techniques at different scopes such as the ‘module or block’ level, the ‘cluster’ level or the ‘full chip’ level. Lower levels help speed up simulations and catch basic bugs quickly. Cluster or full chip environments, on the other hand, help check the interactions between the blocks, which are a common source of bugs. At these higher levels, though, simulation slows down and one starts encountering ‘controllability/observability’ problems.
Emulation, hardware acceleration and prototyping allow testing at much higher speed, possibly at-speed. Hardware/software co-verification using real-life devices can be accomplished using emulators. The primary advantage is the large number of cycles available to reach certain states of the design, say by testing thousands of HD video frames. Also, one can actually bring up peripheral devices such as SATA hard drives, thus reducing the risk in the implementation. The downsides are cost and debuggability.
Formal methods mathematically prove that the behavior of a certain section of the design matches a certain set of properties. Thus, for those properties, the proof is exhaustive and complete! However, the current generation of tools has practical limitations such as design size and speed, and even then, coverage is only as good as the property set. Defining the property set is an intellectual process.
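The point that coverage is only as good as the property set can be shown with a toy example. In the Python sketch below, brute-force enumeration of a tiny input space stands in for a real formal engine (which would use symbolic techniques instead); the 4-bit saturating adder and its planted bug are hypothetical.

```python
from itertools import product

WIDTH = 4
MAX = (1 << WIDTH) - 1


def sat_add(a: int, b: int) -> int:
    """Hypothetical 4-bit saturating adder with a planted bug."""
    total = a + b
    if total > MAX:
        return MAX - 1  # bug: should saturate to MAX
    return total


def holds_for_all_inputs(prop) -> bool:
    """Exhaustively check a property over the whole input space."""
    return all(prop(a, b) for a, b in product(range(MAX + 1), repeat=2))


def in_range(a: int, b: int) -> bool:
    """Weak property: the output never leaves the legal range."""
    return 0 <= sat_add(a, b) <= MAX


def exact(a: int, b: int) -> bool:
    """Stronger property: the output equals the intended saturating sum."""
    return sat_add(a, b) == min(a + b, MAX)


print("in_range proven:", holds_for_all_inputs(in_range))
print("exact proven:   ", holds_for_all_inputs(exact))
```

The weak property is proven for every input even though the design is wrong; only the stronger property exposes the bug.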
Productivity
Productivity, or efficiency, pertains to the processes, environment, tools and methods that can improve the verification cycle. The major costs are the ‘direct’ dollar cost, the cost of human effort and the cost in terms of total calendar time. Simulators, emulators, server farms and other tools contribute to the direct dollar cost. Calendar time is the critical path, and accounts for the processes that cannot be shortened simply by adding resources.
Buy vs. brew is always an important decision, and comes up multiple times during a project. The maintenance cost of a tool developed in-house should not be ignored while making this decision.
The non-deterministic, open-ended nature of verification complicates resource planning too. Predicting the number of simulation licenses needed, server farm size/capacity and estimating the total cycles needed to flush out ‘all the bugs’ is an intellectual process.
Simulation license usage, cycles, build times and run times need to be tracked before they can be improved. Scripts/tools to track these indicators should be planned for. In addition, bug tracking indicators (rate of opening/closing bugs, turnaround times etc.), the rate of test development and coverage improvement need to be measured constantly.
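A tracking script for the bug indicators can be quite small. The sketch below assumes a hypothetical bug-tracker export of (id, week opened, week closed) records and derives the weekly open/close rates, the backlog and the mean turnaround.

```python
from collections import Counter

# Hypothetical bug-tracker export: (bug id, week opened, week closed or None).
bugs = [
    ("B1", 1, 2),
    ("B2", 1, 3),
    ("B3", 2, 3),
    ("B4", 3, None),
    ("B5", 4, None),
]


def weekly_bug_trend(records, last_week):
    """Return rows of (week, opened, closed, still_open) up to last_week."""
    opened = Counter(o for _, o, _ in records)
    closed = Counter(c for _, _, c in records if c is not None)
    rows, backlog = [], 0
    for week in range(1, last_week + 1):
        backlog += opened[week] - closed[week]
        rows.append((week, opened[week], closed[week], backlog))
    return rows


def mean_turnaround(records):
    """Average open-to-close time in weeks, over closed bugs only."""
    spans = [c - o for _, o, c in records if c is not None]
    return sum(spans) / len(spans) if spans else 0.0


trend = weekly_bug_trend(bugs, last_week=4)
for week, n_open, n_closed, backlog in trend:
    print(f"week {week}: opened={n_open} closed={n_closed} backlog={backlog}")
print(f"mean turnaround: {mean_turnaround(bugs):.2f} weeks")
```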
Environment
Verification Environment is a key part of the verification project. This includes organizing the verification collateral, selecting an efficient and robust revision control system, work flow, automation to avoid manual mistakes and improve efficiency, integration of various pieces of verification and choosing an efficient platform for the entire team, including designers, to develop complex projects. And, in today’s global development community, an efficient environment is the key to success. Groups can be spread all over the world, but their development environment should be identical or seamless.
Discipline and attention to detail are extremely important. How many times have we heard of or experienced cases where bugs slipped through the cracks in spite of a ‘test’ that should have caught them, just because the test was not run on the final version of the netlist, or was not part of a certain regression list? These mistakes are expensive, to say the least. Merely adding stress and pressure on engineers doesn’t help either. A fool-proof process and a set of tools/scripts can mitigate these circumstances.
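One way to automate this kind of discipline is a small audit script run before sign-off. The sketch below is only an assumption of what such a script could look like; the test names, revision labels and data layout are hypothetical.

```python
def audit_regression(planned_tests, regression_list, results, final_rev):
    """Return problems that could let a bug slip past an existing test.

    results maps test name -> (design revision it last ran on, passed?).
    """
    problems = []
    for test in sorted(planned_tests):
        if test not in regression_list:
            problems.append(f"{test}: not in regression list")
            continue
        rev, passed = results.get(test, (None, False))
        if rev != final_rev:
            problems.append(f"{test}: last ran on {rev}, not {final_rev}")
        elif not passed:
            problems.append(f"{test}: failing on {final_rev}")
    return problems


# Hypothetical project data.
planned = {"smoke", "dma_burst", "usb_enum"}
regression = {"smoke", "dma_burst"}
results = {"smoke": ("rev42", True), "dma_burst": ("rev41", True)}

problems = audit_regression(planned, regression, results, final_rev="rev42")
for line in problems:
    print(line)
```

Here the audit flags one test that last ran on an older revision and one test that was never added to the regression list, the two failure modes described above.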
Verification Language
The biggest wars in the verification world are fought over this topic. The modeling language for creating the test-bench and the tests is central to any verification strategy. Starting with a bit of historical perspective, Verilog and VHDL have traditionally been used for verification as well as design. The ‘behavioral’ constructs in these languages aid verification tasks. Even today, several companies/projects rely on verification strategies based entirely on Verilog or VHDL. Using Verilog or VHDL has its advantages: the language knowledge is universal, no special simulators are needed, and most EDA tools understand and support these languages. On the downside, these languages were primarily designed to describe digital circuits/logic. They don’t have powerful data structure constructs, and their randomization support is very limited.
To circumvent these limitations, powerful general-purpose programming and scripting languages such as C and Perl were introduced. C and Perl provide the programming power. However, they are still not specifically ‘verification languages’, and every project/company tends to have its own implementation of the methodology.
Over 10 years ago, a new breed of specialized languages known as ‘Hardware Verification Languages (HVL)’ came into existence, starting with Vera and Specman. Vera was born inside Sun Microsystems as an internal verification language/tool. It was spun off and eventually became part of Synopsys. More recently, the industry trend has been towards SystemVerilog, in an effort to standardize on one language. HVLs provide a strong object-oriented structure and advanced features for randomization and constraint solving. In addition, they have become platforms for a myriad of verification functions such as coverage monitoring, assertions and so on.
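To give a flavor of what constrained randomization means, here is a small sketch in plain Python rather than in an HVL. The transaction fields and the constraints are hypothetical, and the constraints are met by construction; a real HVL has a built-in constraint solver that handles far richer constraint sets.

```python
import random
from dataclasses import dataclass


@dataclass
class BusTransaction:
    addr: int
    length: int
    is_write: bool


def random_transaction(rng: random.Random) -> BusTransaction:
    """Draw one transaction under simple, hypothetical constraints:
    word-aligned address inside a 4 KiB window, burst length 1..16,
    and roughly 70% writes.
    """
    addr = rng.randrange(0, 4096, 4)   # constraint: aligned, in window
    length = rng.randint(1, 16)        # constraint: legal burst length
    is_write = rng.random() < 0.7      # weighted distribution
    return BusTransaction(addr, length, is_write)


def generate(seed: int, count: int):
    """A fixed seed makes a failing random test reproducible."""
    rng = random.Random(seed)
    return [random_transaction(rng) for _ in range(count)]


txns = generate(seed=2024, count=5)
for t in txns:
    print(t)
```

Seeding the generator explicitly is the important design choice: the same seed replays the same stimulus, so a failure found by a random run can be debugged.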
Finally, the most recent development has been that of a methodology layer delivered via libraries. VMM and OVM are the two main methodologies in the market today for SystemVerilog.
One could argue that everything that is provided by these higher level HVLs or the libraries can be implemented in Verilog or VHDL. True! But, firstly, you get a tremendous boost in productivity, as these libraries or languages provide a large number of pre-defined functions and constructs. Secondly, these libraries provide a framework that inherently makes the collateral reusable and efficient.
Random, Directed, Emulation?
Another contentious area for design and verification teams is to decide between random, constrained random or directed testing. Another dimension of this debate is the simulation vs. emulation decision. The pros and cons will be discussed in detail in a future article of this series.
Sometimes, it helps to use analogies. For instance, how would one test a car: would one use a test track with ‘simulated’ skids, obstacles and other external environmental factors, or just drive 100K miles on an expressway? One could take this analogy further and look at orthogonal features: say, if the blinkers have been tested in the garage, is it necessary to test them while driving through winding hilly roads, or while driving through winding hilly roads at night in sub-zero temperatures? Identifying such orthogonality helps reduce the set of interesting test cases.
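The saving from such an orthogonality argument can be counted directly. In the sketch below, the dimensions and their values are made up for the car analogy; the point is the difference between the full cross product and the reduced set.

```python
from itertools import product

# Hypothetical test dimensions for the car analogy.
dimensions = {
    "blinker": ["off", "left", "right"],
    "road": ["garage", "expressway", "winding_hills"],
    "temperature": ["sub_zero", "mild", "hot"],
    "light": ["day", "night"],
}

# Full cross product: every combination of every dimension.
full_cross = list(product(*dimensions.values()))

# If blinkers are judged orthogonal to the environment, test them once in
# the garage and cross only the remaining dimensions with each other.
blinker_only = len(dimensions["blinker"])
environment = len(list(product(
    dimensions["road"], dimensions["temperature"], dimensions["light"])))
reduced = blinker_only + environment

print(f"full cross: {len(full_cross)} cases, reduced: {reduced} cases")
```

The reduction is only as valid as the orthogonality judgment behind it, which is itself an intellectual exercise.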
Reviews/Checklist
As mentioned earlier, discipline is very important in Verification. Verification is the final safety net prior to tape-out. Any hole can potentially cost millions. In addition to inserting checks and balances into the tools/scripts/processes, reviews and checklists have a significant role in any verification project.
For reviews, key things to understand are:
- The primary intent should be to solicit feedback from other team members. Thus, every attempt should be made to communicate the content clearly to the attendees. A good idea, a clarification or the detection of an error can save lots of time and frustration downstream.
- Use as many figures and tables as possible. Avoid textual paragraphs. A picture is worth 1000 words!
- Example code review is highly recommended.
- Time should be used efficiently, but a review should not be cut short by a rigid wall-clock limit.
Some of the useful review/checklists, other than the usual, are:
- Tape-out checklist. Some of the items are:
  - Waived coverage points or tests along with justification
  - Waived bugs along with justification
  - Uncovered planned items, if any, with justification/risk assessment
- ‘What-else’ checklist: Once all the planned verification activities are completed along with a satisfactory bug curve, a series of ‘what-else’ reviews is recommended. This is a free-flowing brainstorm of what else can possibly be done to find bugs. As we all know, all bugs can never be found, which also means there are always more bugs in the design to be found, and these reviews can potentially lead to them.
- ‘Last set of bugs’ (this category of review needs a better name!): Towards the end, around the time the ‘what-else’ reviews are held, the last set of bugs uncovered should be reviewed or analyzed for the following:
  - What found the problem? Was it accidental?
  - What caused the bug? Did it exist all along, or did a recent event cause it?
  - Could it have been caught earlier?
  - What if it had slipped? Is there a workaround? This indicates the severity of the bug.
These questions and discussions usually give rise to a few more clues or ideas about how to look for the remaining bugs.
Modeling
Modeling is a very generic term. In the context of verification, it refers to a ‘model’ that describes the correct or golden behavior. Oftentimes, such a model arises out of architectural exploration efforts.
Models can be developed in C (most common), MATLAB, SystemC, SystemVerilog, Verilog or, for that matter, any language.
When a model is used as a golden model for verification purposes, it is very important to consider the verification requirements, as this can have a very high impact on the productivity or efficiency of the project. Most of the time there are surprises, because the models are developed long before verification starts and/or by different groups. Avoid them by planning and collaborating ahead of time with the modeling team. Verifying these models independently is always an interesting and important problem.
Performance Verification
Performance verification is very important and requires explicit attention. Performance verification, as opposed to functional verification, can be tricky for two reasons. Firstly, setting up the cases or the ‘checking mechanism’ is not covered by the traditional functional verification collateral; thus, it is more work and often comes as a surprise. Secondly, identifying, defining and quantifying performance metrics is non-trivial. For instance, if the specification requires the startup time of, say, a product like an iPod to be less than 2 seconds, then one needs to identify the functionality in the hardware that contributes to this delay and then use the resulting number to check for correctness. Sometimes, ‘quality’ aspects are not clearly quantified, especially the acceptable numbers. Video quality is a good example of this.
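A performance check of this kind boils down to a budget comparison. The sketch below takes the 2 second startup requirement from the example and assumes everything else: the clock frequency, the share of the budget allotted to hardware and the cycle counts per phase are all hypothetical numbers.

```python
CLOCK_HZ = 200e6          # assumed clock frequency
STARTUP_BUDGET_S = 2.0    # from the example spec: startup under 2 seconds
HW_SHARE = 0.25           # assumed fraction of the budget given to hardware

# Hypothetical cycle counts measured in simulation for each startup phase.
phase_cycles = {
    "pll_lock": 2_000_000,
    "ddr_training": 40_000_000,
    "boot_rom_fetch": 10_000_000,
}


def hw_startup_seconds(cycles_by_phase, clock_hz):
    """Convert the summed cycle counts of all phases into seconds."""
    return sum(cycles_by_phase.values()) / clock_hz


hw_time = hw_startup_seconds(phase_cycles, CLOCK_HZ)
hw_budget = STARTUP_BUDGET_S * HW_SHARE
verdict = "PASS" if hw_time <= hw_budget else "FAIL"
print(f"hardware startup {hw_time:.3f} s vs budget {hw_budget:.3f} s: {verdict}")
```

The hard part is not the comparison but agreeing on the numbers that feed it, in particular how the product-level requirement is split between hardware and software.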
Other Verification Areas
While planning for verification, the following areas or special cases need to be considered:
- Handling clock domain crossings
- Simulation artifacts, for instance, code that could mask propagation of Xs
- Design rule violations that will not be caught during traditional simulation/emulation
- Check for potential errors introduced by the RTL -> GDS process
- Analog components, and their interface with the digital logic
- Process variations and the logic implemented to compensate for them, for instance, DLLs
- Power simulations, which are going to be commonplace going forward
Example DUT
Let’s take an example DUT – a simple SOC. We will throw in some standard SOC components – a host processor, a co-processor (such as a DSP or some such computational element), internal buses for both control and high speed data transfers, a memory sub-system (DDRs, SRAMs), peripheral IO interfaces such as USB, PCIe, UART, and internal SOC control elements such as IO muxes, clock/power management unit, interrupt management unit.
The following block diagram illustrates our example. All the future topics in this series will be discussed in the context of this example.
Summary
Throughout this series, we will make an effort to identify the ‘art’ and ‘science’ involved in chip verification, with practical tips and some of the best known practices. Through the use of advanced techniques, tools, methods, discipline and best practices, one can mitigate the non-deterministic nature of verification. These can be used for accurate scheduling, planning and high quality execution of verification projects, leading to successful first pass silicon. However, there is an ‘intellectual’ part of the process that still requires the best and the brightest minds. And that’s where the satisfaction and the intellectual rewards lie. Verification accounts for 70% of the pre-silicon development effort, according to some estimates. Let’s not leave it to chance by quoting the famous ‘verification is an NP complete problem’. It can be harnessed, and this has already been demonstrated by several successful groups and companies.
In the next installment, we will cover the ‘test-plan/coverage plan’ topic in detail.
About the Author: Suhas Belgal has 17 plus years of experience in Chip Design, Emulation, Modeling and Verification, including 9 years as a Verification Manager. During these years, Suhas has worked for several multi-billion dollar companies such as Intel and LSI and for various start-ups, and co-founded a Verification Services company. Over the years, Suhas has played key roles on several high profile design teams such as Pentium II, and successfully led several SoC chips to production. He has experience in a wide range of Verification Methods and tools, and has been a presenter and panel member at various conferences, including the DAC. He has a master’s degree in EE from University of Texas at Austin and a bachelor’s from VJTI, Mumbai.

This content has been licensed to PuneChips under a Creative Commons Attribution-Noncommercial 2.5 India License. Contact Suhas Belgal for details of how to attribute and re-use for non-commercial as well as commercial distribution.

