Electronic design is a process that is full of risks.
Does my design work? Did I include all of the
specified functionality? Am I ready to go to tapeout?
Does my software work? Did I meet my performance
goals? Did I miss any bugs that might result in
product recalls?
This is why we perform verification; verification
is the “insurance policy” of the design world.
Verification is all about eliminating the risks in your
design process, ensuring that your product will meet
specifications, ship without bugs, and ship on time.
However, the complexities of verification are driving
project teams to consider new technologies and
methodologies, changing their current (working)
verification environment and adding more risk.
We are trying to reduce one kind of risk by adding
another type of risk! This is a fundamental paradox
facing every project team today. How can you avoid
adding risks while trying to reduce risks?
New languages, new methodologies, and new
technologies each can introduce significant risks
to your projects. However, if they are coupled properly, and if the right solution is matched to the right application and to the right engineering specialist or team of specialists, then project risks can be reduced.
Starting on September 12, a new event is coming
to Silicon Valley that will help you learn how to
reduce the risk of enhancing your verification
environment and ultimately reducing the total risk
to your project plans.
CDNLive! Silicon Valley 2005 kicks off a new, global
series of technical conferences for Cadence users. This
premier technical conference is managed by a
steering committee of Cadence users. At CDNLive!,
you will have the opportunity to network with other
design and verification engineers who are addressing
and overcoming verification challenges such as
verification process management, functional coverage,
assertion-based verification, mixed-language design
environments, and hardware/software verification.
You’ll also have the opportunity for face-to-face time
with Cadence R&D engineers and other technologists.
At CDNLive!, you’ll hear four luminary keynotes,
including Dr. Bernard Meyerson (IBM) and Mark
Edelstone (Morgan Stanley). You will be able to visit
with 50 exhibitors at the Designer Expo. And you will
see live demos of the latest Cadence technologies and
meet their developers at Technology Demo Night.
You will be able to choose from over 94 technical
paper/panel presentations from your peers and
Cadence technologists, including:
• Methodology for integrated use of Cadence tools
for simulation, coverage, and analysis (QLogic)
• Verification for a GHz DSP — Experience designing
an ERM-compliant environment (TI)
• Formal analysis using Incisive Formal Verifier for PCI
Express validation (ATI)
• In-circuit emulation techniques for pre-silicon
verification (Freescale)
• Dynamic assertion-based verification using PSL and
OVL (Philips)
• Effective modeling techniques for formal verification
of interrupt-controllers (TI)
• Achieving vertical reuse in SoC verification (Cisco)
• Bringing up the system-level e verification
environment for a multimillion gate FPGA in seven
days! (Cisco)
• Application of dynamic random sequencing in
augmenting constrained randomization (QLogic)
• End-to-end data-flow monitoring with linked
hierarchical transaction modeling for system
verification (QLogic)
• Benefits and methodology impact of adopting formal
property checking in an industrial project (ST)
• Verification methodology for OCP-based systems
• The use of a general control and communications
module in a SystemC testbench (LSI)
• Architectural analysis of an eDRAM arbitration
scheme using transaction-level modeling in SystemC
(Silicon and Software Systems)
And finally, there will be a special series of hands-on
tutorials to help you learn about the latest
innovations and capabilities. These will be taught by
Cadence technologists and will include three sessions
focused specifically on verification:
• Complete assertion-based verification (ABV) with the
Incisive® platform
• Deploying SystemVerilog with the Incisive platform
• Advanced verification from plan to closure
Register now for this exciting event, to be held at the
Westin in Santa Clara on September 12–15.
2 | Incisive Newsletter — August 2005
As gate counts continue to swell at a rapid pace,
modern systems-on-chip (SoCs) increasingly are
integrating more design-for-testability (DFT)
capabilities. Test and diagnosis of complex integrated
circuits (ICs) will soon become the next bottleneck,
if, in fact, they have not already. With up to 30%
of a project’s cycle already being spent debugging
silicon, and typically 30–50% of total project costs
being spent on test, DFT quickly is becoming the next
wild card. As daunting a task as reining in all the
variables related to DFT infrastructure can seem, an
enormous opportunity awaits those ready to take up
the challenge.
Today, DFT usually is nothing more than a collection
of ad-hoc hardware put together by different people,
using different tools, with neither a common strategy
nor a vision of an end quality of result. The inability
to deliver a reliable test infrastructure inevitably
leads to missed market opportunity, increased
manufacturing costs, or even a product that is not
manufacturable. Instead, a carefully designed and
verified DFT scheme, reflecting coherent test intent
across the board, can be an excellent value
differentiator throughout the lifetime of a product.
This brief article discusses how to plan DFT verification against test intent, ensure compatibility with
standards and functional correctness, and create a
complete, methodical, and fully automated path from
specification to closure.
The foundation for systematic DFT verification is
a well-defined set of goals, supported by a
methodology developed to provide integration-oriented
test methods for chip-level DFT, to enable
compatibility across different embedded cores, and
to incorporate high levels of reuse. A DFT-verification
plan must satisfy these three separate objectives:
• Does the test infrastructure adhere to the test
strategy and specification set forth by the design
and test engineers? You must verify that the global
test intent is designed and implemented properly.
• Does the test infrastructure comply with industry
standards for interoperability and universal
facilitation of access? This is crucial to ensure reuse
of hardware and software.
• Are there functional-design issues with the DFT
resources? Although such resources may appear to
operate within the parameters of the first two points,
there could be logic bugs in the implementation.
Once the objectives are defined, the development of
a complete system-level DFT-verification plan should
follow these general steps:
1. Capture system-level test intent and translate it into
an executable plan
2. Integrate heterogeneous core-level DFT plans into
the system-level DFT plan
3. Provide a completely automated path from
specification to closure
4. Provide quality-of-result metrics
5. Provide total-progress metrics
6. Integrate the DFT plan into the IC-level verification plan
Finally, this methodology also needs to apply to
internal engineering teams delivering design IP
for integration. Such teams have different skills and
management styles, and can operate in different
geographies or business units. They, too, must
understand the need to plan DFT verification and
provide the necessary components to enable this.
In order to build a successful DFT-verification plan,
you first must capture system-level test intent.
During this step, test engineers need to work closely
with verification engineers to ensure that the plan
includes all the aspects of test that must be available
in the final product. The global test-access mechanism
(TAM) is the primary element at this point; however,
other elements can come into play, such as top-level
DFT features, integration to board-level testability,
or hardware/software interfacing and information.
Management must also be involved so that they gain
an understanding of the implications and trade-offs
of building reliable DFT. This will ensure total visibility
and resolve contention for resources further down
the road.
The preferable way to deliver the system-level test
intent description is in executable form. The global
verification plan must leave no room for doubt or
misinterpretation. Furthermore, it needs to provide
a solid basis for automation of subsequent steps
down to design closure.
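To make "executable" concrete: a plan delivered as data that downstream tools can read and act on, rather than as prose. The sketch below is purely hypothetical — the structure, field names, and check names are illustrative assumptions, not any Cadence or vendor format:

```python
# Hypothetical executable test-intent fragment: each entry names a DFT
# feature, the standard it must comply with, and the check that exercises it.
TEST_INTENT = [
    {"feature": "global_tam",    "standard": "IEEE 1149.1", "check": "run_tap_compliance"},
    {"feature": "core_wrappers", "standard": "IEEE 1500",   "check": "run_wrapper_isolation"},
    {"feature": "board_access",  "standard": "IEEE 1149.1", "check": "run_boundary_scan"},
]

def checks_for(standard):
    """Let downstream automation pull every check tied to a given standard."""
    return [e["check"] for e in TEST_INTENT if e["standard"] == standard]

print(checks_for("IEEE 1149.1"))  # ['run_tap_compliance', 'run_boundary_scan']
```

Because the intent is data, no step from here to closure depends on a human re-interpreting a written specification.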
Having a) captured the high-level test intent in an
executable plan and b) integrated separate core-level
DFT schemes, verification and test engineers are now
empowered to drive their processes more effectively.
The result is a fully automated path from plan to
closure for DFT verification, ensuring:
• Completeness — The verification plan includes a
section on all DFT features and their specifics
• Intent — The verification scope has been defined
early in the process by experts and with complete
• Uniformity — Disparate test strategies can now be
driven by a single process
During this stage, engineers should seek and
incorporate various elements that will be used as
building blocks to implement the verification strategy
according to the plan. Such elements can include:
• Verification IP, used to run verification tests on
individual DFT features
• Test-information models, used to exchange
information with other tools and processes for
enhanced automation
Bridging the gap between the ad-hoc world of
spurious DFT resources and planned system-level
DFT is not a trivial task. Individual intellectual
property (IP) vendors’ strategies for testability can
vary significantly in terms of quality, coverage, and/or
support deliverables.
In this phase of DFT-verification planning, it is
important to work closely with vendors to align
DFT strategies as closely as possible, and to enforce
quality metrics. Optimally, vendors should work with
their customers’ test engineers to design pluggable
DFT schemes and plans.
By capturing core-level test intent in an executable
plan, and including it in the deliverables, IP vendors
can provide a new added value to their customers.
Such executable plans can then flow into the IC
system-level test plan (see Figure 1). This is key to
unlocking the paradox of driving a uniform SoC-level
test plan based on heterogeneous core-level DFT
schemes from different vendors.
Figure 1: The system DfT verification plan integrates embedded-core verification plans (e.g., IEEE 1500), standard test-interface verification plans (e.g., JTAG), and DfT-feature verification plans
But how can you conclude that the DFT-verification
plan guarantees the necessary quality for reliable
DFT? In order to address this inherent unpredictability, you must set expectations for quantifying
results as part of your plan.
Quality-of-result metrics are measurable targets that
can be entered as verification-plan attributes early
in the planning phase. Such targets must result from
collaboration among verification, design, and test
engineers in order to ensure that all the aspects of
the task at hand are addressed. They can include
functional coverage metrics, such as the number of
different instructions loaded in any given JTAG TAP,
seed patterns used in automatic test-pattern
generators (ATPGs), or isolation behavior of
embedded-core scan cells. All these metrics should
be associated neatly with a respective section of
the executable test plan.
It is also a good idea to assign priorities or weights
to different test-plan sections based on these metrics.
For example, what is the purpose of exhaustively
testing a built-in self-test (BIST) controller connected to
a JTAG TAP if the TAP is not thoroughly verified first?
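A weighted roll-up of quality-of-result metrics might look like the following sketch; the section names, weights, and scoring scheme are hypothetical illustrations, not a prescribed format:

```python
# Hypothetical verification-plan sections carrying quality-of-result
# targets (e.g., functional-coverage percentages) and priority weights.
plan = {
    "jtag_tap":        {"weight": 3.0, "target": 100, "achieved": 100},
    "bist_controller": {"weight": 1.0, "target": 100, "achieved": 40},
    "core_isolation":  {"weight": 2.0, "target": 100, "achieved": 70},
}

def weighted_progress(plan):
    """Overall DFT-verification progress, weighted by section priority."""
    total = sum(s["weight"] for s in plan.values())
    done = sum(s["weight"] * min(s["achieved"] / s["target"], 1.0)
               for s in plan.values())
    return 100.0 * done / total

print(f"{weighted_progress(plan):.1f}% weighted progress")  # 80.0% weighted progress
```

Weighting the TAP heavily reflects the point above: downstream features are only as verified as the access mechanism they sit behind.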
Quality-of-result metrics can be a guide to
understanding and reporting progress, and can
identify critical paths. Once the project is underway,
it is difficult to track the progress of specific tasks
and the implications of prioritization. Tracking
quality-of-result progress provides a way of
correlating real DFT-verification progress while
simultaneously enabling total visibility across the
different teams. This way, test engineers can know
at all times the progress of the verification of DFT
across the board and use this information to drive
other processes, such as test-vector generation or
early fault analysis. They can also use this information
to raise management awareness of issues that may
arise during the design process.
Finally, a methodical DFT-verification plan must be
integrated into the system-level IC-verification plan.
This way, DFT quality-of-result metrics can be factored
into chip-level metrics for total-quality management
for closure. System-level planning should incorporate
the processes and methodologies of effective DFT
verification. This enhances the allocation of necessary
resources, and ensures that expert knowledge is
shared across teams.
Furthermore, a DFT-verification plan can help bridge
the cultural gap that today divides test engineers
from the rest of the design-cycle, and can advocate
cooperation between the two main bottlenecks of
today’s SoC design: verification and test.
There are a variety of motivating factors for
planning and executing proper DFT verification.
The investment made during the design cycle can
be leveraged to reap a series of long-term benefits.
Increased visibility into test intent across development
teams results in better integration of the design and
test engineering processes, skills, and cultures.
Methodical plans to verify test infrastructures create
a well-defined process for incorporating input from
test engineers into the development cycle. Test
engineers participate in creating the global test
specification, helping to qualify vendors based on
DFT-quality metrics, and/or prioritizing verification
tasks against target results. This enhanced visibility
also results in the reverse benefit of better
communication and information feedback from
manufacturing/test back to design in order to close
the design-for-manufacturability (DFM) loop.
As semiconductor processes move deeper into
nanometer scales, the cost of fabrication and test is
exploding. Fabrication facility costs at 65nm are
expected to hit $4 billion. If the current test-capital-per-transistor ratio persists (it has been flat for 20
years), in several years the general cost of test will
exceed fabrication cost. Associated low yield also
increases the number of test cycles required to
determine the quality of silicon.
DFT-verification planning aims to provide a reliable
path from test intent to quality of results. Adding
quality and efficiency to test planning leads to better
testing strategies aimed at locating real silicon faults
while minimizing costly over-testing and/or excessive
vector sets. Testing on advanced automated test
equipment (ATE) at 90nm can exceed $0.10/second
per unit; for a batch of 1 million at 100% yield, that’s
$100,000/second. Improving the DFT-planning process
can help companies make efficient use of this
expensive tester time, or even help them to switch to
cheaper ATE resources by partitioning more test
resources on-chip.
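A back-of-the-envelope check of the tester-cost figures quoted above (the two-second savings at the end is a hypothetical illustration of the payoff):

```python
# Figures from the text: ~$0.10 per second of ATE time per unit at 90nm.
cost_per_unit_second = 0.10
batch_size = 1_000_000          # units, assuming 100% yield

# Cost of every second the whole batch spends on the tester:
batch_cost_per_second = cost_per_unit_second * batch_size
print(f"${batch_cost_per_second:,.0f}/second")   # $100,000/second

# Hypothetical illustration: shaving 2 seconds of test time per unit
# through better DFT planning saves:
savings = 2 * batch_cost_per_second
print(f"${savings:,.0f} on the batch")           # $200,000 on the batch
```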
In a demanding consumer-driven electronics market,
execution of a product strategy leaves no room for
error. Re-spins simply are not an option when
targeting 90nm (or below) process technologies.
Lengthy silicon debugging, manufacturing-test time,
low yield, and lack of diagnosability substantially
impact the time-to-market window. Proper planning
for DFT verification results in increased design- and
test-schedule predictability and repeatability, better
process automation, and enhanced efficiency with a
direct, positive effect on time-to-market.
Designing verification IP that can be invoked directly
and automatically from the plan results in additional,
significant time savings. Such IP can include complete
environments capable of generating test vectors,
checking DFT state, and measuring the extent of
exercise of the test infrastructure. Such IP should
be designed only once for standard components
(e.g. JTAG), and then enriched with feature-specific
libraries for customization. Time invested up front
results in overall project-time savings by ensuring
that DFT is designed and verified only once. Such
savings are optimized from project to project due
to complete and calculated reuse.
With third-party IP playing such an integral role in
today’s SoCs, DFT-verification planning can be used
to increase levels of process integration and
automation with strategic vendors. Furthermore,
DFT-quality metrics can be incorporated in new
vendor assessment by grading the vendor’s test
strategy, DFT implementation, DFT reuse, and
applicability targets. Qualification metrics offer
advanced vendors incentives to provide complete,
executable verification plans, IP, and test information
models for enhanced integration into their customer’s
test infrastructure.
Large and complex test infrastructures are a reality
in today’s dense SoCs, which comprise a multitude
of diverse DFT resources. If companies are to meet
their manufacturing-cost and time-to-market
demands, they will need to ensure that such test
infrastructures are well verified for specification,
compliance, and functionality. At the foundation
of the solution lies a detailed executable plan that
can be used to provide an automated path from
specification to closure with predictable quality
of result.
How are you planning to verify all that DFT?
The Cadence® Incisive® Unified Simulator, as well
as the NC-Verilog® simulator and the NC-VHDL
simulator, includes a utility called nchelp that you
can use to get information on warning and error
messages. This utility displays extended help for a
specified warning or error.
The syntax is as follows:

% nchelp tool_name message_code
where tool_name is the tool that generated the error,
and message_code is the mnemonic for the warning
or error. You can enter the message_code argument
in lowercase or in uppercase. For example, suppose
that you are compiling your Verilog® source files with
ncvlog, and the following error message is generated:
ncvlog: *F,NOWORK (test.v,2|5): no
working library defined.
To get extended help on this message, use either of
the following commands:
% nchelp ncvlog NOWORK
% nchelp ncvlog nowork
This command displays information about the
message, the cause of the error, and ways to fix
the error.
ncvlog/nowork =
A working library must be associated
with every design unit parsed by
ncvlog. No working library was defined
for the specified file and position.
Specify a working library using one of
the following constructs and rerun:
`worklib <lib> // compiler directive
-WORK <lib> // command-line option
WORK <lib> // hdl.var variable
LIB_MAP ( <path> => <lib> [, ...] ) // hdl.var mapping
Here is another example of using nchelp. In this
case, nchelp is used to display extended help for
the elaborator DLWTLK error.
% nchelp ncelab DLWTLK
ncelab/DLWTLK =
The library specified in the message
requires that the calling process
wait for a lock before the data can
be accessed. Another process may be
accessing the library at the same
time. The process will wait for up
to an hour attempting to get the
lock, and then will terminate with
another error if the lock could
never be achieved.
Some possible causes include:
– Another process may have locked
the library. See the documentation
of the -lockinfo option of the ncls
utility for how to examine the lock.
– A suspended process may have left
a lock on the library. Use the -unlock option of the ncpack utility.
– Locking may currently be broken
either on the network or between the
two machines involved. Use the -lockcheck option of the nchelp utility
to check for these states.
– If you are using the +ncuid option on
ncverilog to enable parallel simulation,
ensure that each invocation of ncverilog
uses a distinct <ncuid_name>.
– The library database may have been
corrupted. In this case, delete and
rebuild the library database.
To get help on the nchelp utility, use the –help
option. This option displays a list of the nchelp
command-line options.
% nchelp -help
The following command-line options are especially useful:
• –cdslib — Displays information about the contents of
cds.lib files. This can help you identify errors and any
incorrect settings contained within your cds.lib files.
• –hdlvar — Displays information about the contents
of hdl.var files. This can help you identify incorrect
settings that may be contained within your hdl.var files.
In addition to providing extended help for messages
generated by the compiler, elaborator, and simulator,
nchelp can provide help for messages generated
by other tools, such as NCLaunch or HAL, and for
messages generated by various NC utilities such as
ncupdate, ncls, ncprotect, and so on. To display a list
of all tools for which extended help is available, use
the –tools option:
% nchelp -tools
For complete information on the nchelp utility, see
the section “nchelp” in the chapter called “Utilities”
in the NC-Verilog Simulator Help or the NC-VHDL
Simulator Help Documents.
Having had some experience with assertion libraries
in the past, I was interested to see what Cadence
had to offer when it started touting its integration
of Property Specification Language (PSL) hooks and
other proprietary library components in NC-Verilog®
and the Incisive® verification library. In order to
maintain tool independence and to minimize cost,
I was especially interested in seeing how NC-Verilog
handled standard PSL and how easy it would be to
invoke these features with my current Verilog® design
and testbench. When I was told that I simply had to
invoke the “+assert” plusarg on the command line,
it was not hard to convince me to just try it out —
this sounded like a low-risk proposition for both the
design and for my time.
During early tests of the software, I had some bumps
as I either discovered or duplicated a couple of tool
bugs. But these were resolved with bug fixes and
did not require me to change my RTL or PSL code.
I was then able to move on easily from using a single
simulation test case to running a full regression on
my design. The full regression produced a few error
messages triggered by the simulator to indicate that
one of the assertion specifications had been violated
while exercising a certain corner case
in the design. This error pointed to a problem that
definitely would have resulted in a silicon revision.
By looking at the assertion error message, I was able
to identify the time the error occurred in simulation
and to observe it with a waveform viewer.
After working out a few tool bugs with the Cadence
team at the factory and a couple of applications
engineers, I did, in fact, see that using PSL inside
my Verilog code with NC-Verilog was basically as
advertised, and no more complicated than turning
on the “+assert” plusarg. And there were other
unexpected benefits as well — I found a bug in a
design that was thought to have been verified fully.
The PSL code that found the error was very simple
to write, but provided a very significant check:

// psl max_req_size: assert never
// (Sop && !Write && (ByteCount >
// @(posedge clk);

For my specific test case, I implemented 40 PSL
assertions to cover two main types of verification:
1) interactions at the major interfaces to my 400K-gate
block, and 2) complicated and hard-to-verify
internal areas of the code. Many of these assertions
were new, but many were also translated from
another vendor's proprietary library into PSL.
The bug causing the error was related to a
complicated transfer-size calculation that involved
over 200 lines of code and was not correct in some
corner cases. With many different requirements on
the design, it was difficult initially to write the RTL
code to cover all possible corner cases. Writing the
assertion specifications in PSL and then running them
through a dynamic regression to find errors provided
an automated way for the simulator to flag the error.
I did not have to write additional long tests to cover
high-level functions; I just had to specify the
properties of the small area of the logic of greatest
concern to me and enable the assertions during these
tests. This was much quicker than spending hours
randomly poring over Verilog code looking for an
issue that was not flagged as an error during
simulation and was not known to exist.
Furthermore, using PSL as a means to check the
design through dynamic simulation enabled me to
attack verification from a different angle. The bug
that was found should have been detected already
through checks that were required in our bus
function model (BFM). But the BFM checks also were
not being done properly. So the assertions not only
found a bug in our design, but also in our dynamic-simulation environment.
The failing test case was well-suited to assertion-based verification (ABV) techniques because it
selectively targeted a concise, yet complicated,
section of the code. The PSL code was not used
simply as a tool to re-write the RTL code; in fact, it
would have been very difficult to re-write this code
in PSL. Instead, the PSL assertions targeted only a
very specific, modular part of the whole calculation.
The assertions were also much less complicated and
time consuming to write than the original RTL code.
Another thing that we looked at during this
evaluation was the capability of Incisive to infer
assertions related to synthesis pragmas
(parallel_case/full_case) automatically during
dynamic simulation. Initially, there were some
errors reported from these automatically inferred
assertions. After looking into them, we found that
these errors were not real, but only related to event
ordering within the asynchronous logic, and that by
the time the sequential logic observed these values
they all had met the criteria assumed by the synthesis
pragmas. Based on this observation, I requested that
Cadence provide a way to associate each of these
automatic assertions to a clock. Cadence agreed, and
in a subsequent release provided a way to declare
a clock for synchronous evaluation of these pragmas.
This cleared up the false errors reported by the
pragma assertions.
The end result of this evaluation is that we became
more familiar with the capabilities of the Incisive
verification tools with very little time investment,
we improved our design, and we gained a better
understanding of how to verify designs in the future.
The payback for generating assertions and running
them in dynamic simulation was much greater than
the time investment it took to generate them. In
the future, we will look more into using assertions
for static verification as the tools mature and
capacity improves. Also, during this evaluation we
experimented some with the Cadence Incisive
Assertion Library (IAL) components and we may
expand our use of those in the future.
For better or for worse, the engineering community,
the press, and the EDA vendors themselves have
classified the world of simulation acceleration and
emulation incorrectly into two camps: field-programmable gate arrays (FPGAs) and application-specific integrated circuits (ASICs). Advocates in each
camp declare the same tired facts. FPGAs take
forever to compile and have internal-timing
problems. ASICs are power hungry and require
longer development time.
When it comes to choosing an emulation system,
the underlying technology does contribute to the
characteristics of the system, but far too much time
is spent on low-level technology details and not
enough time on how emulation gets the verification
job done by providing high performance and high capacity.
What engineers intend to say when they discuss
“FPGA vs. ASIC” is “prototyping vs. simulation
acceleration and emulation.” To add to the
confusion, some semiconductor companies even
call their internally developed FPGA prototype an “emulator.”
This paper discusses the factors that are important
when evaluating simulation acceleration and
emulation and the different use-modes and
applications for acceleration and emulation. With
the acquisition of Verisity, Cadence now offers two
of the most successful acceleration/emulation product
lines in the market. From this unique position,
Cadence can best serve customers by evaluating real
verification needs and recommending products and
technologies that enable improved verification
through the use of simulation acceleration and emulation.
Before the important aspects and use-modes of
emulation are presented, some definitions are
needed. Four distinct methods are used commonly
for the execution of hardware design and verification:
• Logic simulation
• Simulation acceleration
• Emulation
• Prototyping
Each hardware execution method has associated
with it specific debugging techniques; each has its
own set of benefits and limitations. The execution
time for these methods ranges from the slowest (with
the most thorough debugging) to the fastest (with
less debugging). For the purpose of this paper the
following definitions are used:
Software simulation refers to an event-driven
logic simulator that operates by propagating
input changes through a design until a steady-state
condition is reached. Software simulators run on
workstations and use languages such as Verilog®,
VHDL, SystemC, SystemVerilog, and e to describe the
design and verification environment. All hardware
and verification engineers use logic simulation to
verify designs.
Simulation acceleration refers to the process of
mapping the synthesizable portion of the design
into a hardware platform specifically designed to
increase performance by evaluating the HDL
constructs in parallel. The remaining portions of the
simulation are not mapped into hardware, but run in
a software simulator. The software simulator works in
conjunction with the hardware platform to exchange
simulation data. Removing most of the simulation
events from the software simulator and evaluating
them in parallel using the hardware increases
performance. The final performance is determined by
the percentage of the simulation that is left running
in software, the number of I/O signals communicating
between the workstation and the hardware engine,
and the communication channel latency and
bandwidth. A simple representation of simulation
acceleration is shown in Figure 1.
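This trade-off behaves like Amdahl's law: the fraction of activity left in the software simulator bounds the overall gain, no matter how fast the hardware engine runs. A rough illustrative model (the numbers below are assumptions, not benchmark data):

```python
def acceleration_speedup(sw_fraction, hw_speedup, comm_overhead=0.0):
    """Estimated overall speedup when part of a design is mapped into a
    hardware engine (a simple Amdahl's-law-style model).

    sw_fraction   -- fraction of simulation time left in the software simulator
    hw_speedup    -- raw speedup of the hardware engine on the mapped portion
    comm_overhead -- extra time (as a fraction of original runtime) spent
                     exchanging data between workstation and engine
    """
    hw_fraction = 1.0 - sw_fraction
    new_time = sw_fraction + hw_fraction / hw_speedup + comm_overhead
    return 1.0 / new_time

# Even with a 1000x hardware engine, leaving 10% of the events in
# software caps the overall speedup near 10x:
print(round(acceleration_speedup(0.10, 1000), 1))   # 9.9
# Shrinking the software-resident portion to 1% raises the ceiling:
print(round(acceleration_speedup(0.01, 1000), 1))   # 91.0
```

This is why the percentage of the simulation left in software dominates the I/O and latency terms in practice.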
Emulation refers to the process of mapping an entire
design into a hardware platform designed to further
increase performance. There is no constant
connection to the workstation during execution,
and the hardware platform receives no input from
the workstation. By eliminating the connection to
the workstation, the hardware platform runs at its
full speed and does not need to wait for any
communication. A basic emulation example is
shown in Figure 2. By definition, all aspects of the
verification environment required to verify the
design are placed into the hardware. Historically, this
mode has restricted coding styles to the synthesizable
subset of Verilog® and VHDL code, so this mode is
also called embedded testbench or synthesizable
testbench (STB). Even without a constant connection
to the workstation, some hardware platforms allow
on-demand workstation access for activities such as
loading new memory data from a file, printing
messages, or writing assertion-failure data to the
workstation screen to indicate progress or problems.
Figure 1: Simulation acceleration (the verification environment on the workstation exchanges data with the synthesizable design in the hardware engine, with on-demand workstation access)
During simulation acceleration, the workstation
executes most of the behavioral code and the
hardware engine executes the synthesizable
code. Within simulation acceleration there are two
modes of operation, signal-based acceleration and
transaction-based acceleration. Signal-based
acceleration (SBA) exchanges individual signal
values back and forth between the workstation and
hardware platform. Signal synchronization is required
on every clock cycle. SBA is required for verification
environments that utilize behavioral-verification
models to drive and sample the design interfaces.
Transaction-based acceleration (TBA) exchanges only
high-level transaction data between the workstation
and the hardware platform at less-frequent intervals.
TBA splits the verification environment into two
parts: the low-level state machines that control the
design interfaces on every clock, and the high-level
generation and checking that occurs less frequently.
TBA implements the low-level functionality in
hardware and the high-level functionality on the
workstation. TBA increases performance by requiring
less-frequent synchronization and offers the option
to buffer transactions to further increase performance.
Figure 2: Emulation
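The gap between per-cycle (SBA) and per-transaction (TBA) synchronization can be illustrated with a small model (the latency, clock rate, and cycle counts below are invented for illustration, not measured figures):

```python
def runtime_seconds(cycles, hw_cycle_time, syncs, sync_latency):
    """Total runtime = hardware evaluation time + channel synchronization time."""
    return cycles * hw_cycle_time + syncs * sync_latency

CYCLES = 10_000_000   # emulated clock cycles in the test
HW_CYCLE = 1e-6       # assumed 1 MHz hardware engine -> 1 us per cycle
LATENCY = 50e-6       # assumed 50 us round-trip workstation latency

# SBA: signal values are exchanged on every clock cycle
sba = runtime_seconds(CYCLES, HW_CYCLE, syncs=CYCLES, sync_latency=LATENCY)

# TBA: only high-level transactions are exchanged, say one per 1,000 cycles
tba = runtime_seconds(CYCLES, HW_CYCLE, syncs=CYCLES // 1000, sync_latency=LATENCY)

print(f"SBA: {sba:.0f} s, TBA: {tba:.1f} s")
```

Under these assumed numbers, per-cycle synchronization spends far more time on the communication channel than on evaluation, which is why less-frequent synchronization and transaction buffering pay off.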
In-circuit emulation (ICE) refers to the use of external
hardware coupled to a hardware platform for the
purpose of providing a more realistic environment for
the design being verified. This hardware commonly
takes the form of circuit boards, sometimes called
target boards or a target system, and test equipment
cabled into the hardware platform. Emulation
without the use of any target system is defined as
targetless emulation. A representation of ICE is
shown in Figure 3. ICE can be performed in two
modes of operation. The mode in which the emulator
provides the clocks to the target system is referred to
as a static target. When running ICE with static
targets, it is possible to stop and start the emulation
system, usually for the purpose of debugging. The
mode in which the target system provides the clocks
Target boards
workstation access
Verification environment
Hardware engine
Figure 3: In-circuit emulation
12 | Incisive Newsletter — August 2005
Incisive Newsletter — August 2005 | 12
to the emulator is referred to as a dynamic target.
When using dynamic targets, there is no way to stop
the emulator, because it must keep up with the clocks
supplied by the target system. Dynamic targets
require special considerations for proper operation
and debugging.
Hardware prototype refers to the construction of
custom hardware or the use of reusable hardware
(breadboard) to construct a hardware representation
of the system. A prototype is a representation of the
final system that can be constructed faster and made
available sooner than the actual product. This speed
is achieved by making tradeoffs in product requirements such as performance and packaging. A
common path to a prototype is to save time by
substituting programmable logic for ASICs. Since
prototypes are usually built using FPGAs, they are
often confused with and compared to emulation
systems that also use FPGA technology. As we will
see, there are very few similarities between how
FPGAs are used in prototypes and in emulation.
Prototypes use a conventional FPGA-synthesis flow
to map a design onto the target FPGA technology.
Partitioning and timing issues are left to the
prototype engineer to solve and require careful
planning from ASIC designers at the beginning of
the design process.
Now that the basic definitions of acceleration and
emulation are clear, it’s important to note that while
some products in the market perform only simulation
acceleration and some perform only emulation, more
and more, the trend is for products to do both. The
systems that perform both simulation acceleration
and emulation usually are better at one or the other
mode of operation, regardless of what the marketing
brochure says. In the rest of the paper, the general
term “emulator” is sometimes used for systems that
do both simulation acceleration and emulation, but
that “specialize” in emulation.
For engineers evaluating simulation acceleration
and emulation, there are many important factors
to consider. The main motivation for using either
simulation acceleration and/or emulation is always
the significant performance increases that are
possible over that of a logic simulator alone; but
performance is not the only criterion to consider.
This section discusses a baseline set of features that
are must-haves for all users. Without these features,
a product is probably not useful.
Automated compile process: Emulation offers a
completely automated compile flow. The process
is similar to compiling a design for logic simulation,
but involves more steps to map the design onto the
hardware architecture and prepare the database to
be downloaded into the hardware. The user does not
need to learn about the details of the hardware, nor
how to partition and route the design across boards
and chips. There is no need to learn about timing
issues inside the system.
Simulation-like debug: Emulation provides many
advanced debugging techniques to find and fix
problems as quickly as possible. When needed, 100%
visibility for all time is available, as well as many
other modes of debugging, including both interactive
during execution and using post-processing methods
when execution is complete.
Multi-user system: Project teams surely will want
to verify entire chips or multi-chip systems using
emulation. These tasks require large capacity. At
other times, emulation is useful for smaller subsystems of the design, and capacity can be shared
with other projects. To get the most value from
emulation, high capacity should be available when
needed, but also should be able to be split into
smaller pieces and shared among engineers.
Support for complex clocking: Another benefit of
emulation is its ability to handle complex clocking
in a design. Designs today use every trick possible
to manage performance and power requirements.
Most of these techniques do not map well onto
FPGA architectures, which support primarily a
single-clock-edge, synchronous design style.
Easy memory mapping: Emulation provides the
ability to map memories described in Verilog and
VHDL automatically to the hardware with little or
no user intervention.
Intellectual property (IP) integration: Many designs
contain hard- and/or soft-IP cores. Emulation
provides the ability to integrate processor models
so emulation can be used to execute embedded
software.
Emulation products are no different than any
other product: no one product can be the best
at everything. In every design process there are
constraints such as cost, power, performance, and
system size. Engineers make tradeoffs to determine
which factors are important, what to optimize, and
what to compromise. Understanding the keys to
emulation success in each mode of operation will
lead to the best solution for the situation.
The key to success for signal-based acceleration is
easy migration from logic simulation. SBA users want
to turn on performance with minimal effort. System
characteristics that make an easy migration possible
include:
• Easy migration path from simulation. This includes
partitioning of the code that goes to the hardware
vs. the code that goes to the workstation and an
automatic connection between these two partitions
• The closer the evaluation algorithms are to the logic
simulator’s algorithms, the better
• The ability to swap dynamically the execution image
from the logic simulator into hardware and back
again at any time
• Initialization flexibility, such as the ability to use
Verilog initial statements and procedural language
interface (PLI) code, and to execute initialization
sequences in logic simulation before starting
acceleration
• Dynamic-compare for debugging when results don’t
match logic-simulation results
The key to success for transaction-based acceleration
is the verification environment and the verification IP
(VIP). TBA offers higher performance compared to
SBA, but requires some additional planning up front.
System characteristics that make TBA successful
include:
• Easy-to-use TBA infrastructure for VIP creation
• High-bandwidth, low-latency channel between the
emulator and workstation that enables the emulator
to run as the master and to interrupt the workstation
dynamically when needed
• Library of TBA-ready VIP for popular protocols that
runs congruently in the logic simulator and on the
emulator
Synthesizable testbench is all about performance.
In addition to runtime speed, performance also
comprises the overall turnaround time, including
compile time, waveform generation, and debugging.
For those designs that don’t have multiple complex
interfaces connected to the design, STB provides the
highest level of performance possible. In addition,
STB is used commonly for software development
where performance is critical. The ability to interrupt
the workstation infrequently in order to execute
some behavioral functions, such as loading memory
or printing messages, makes STB easier to use.
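The point that performance comprises overall turnaround time, not just raw runtime, can be sketched numerically (the hours below are hypothetical, purely for illustration):

```python
def turnaround_hours(compile_h, run_h, waveform_h, debug_h):
    """Overall turnaround time per verification iteration, not just runtime."""
    return compile_h + run_h + waveform_h + debug_h

# A 10x-faster engine helps little if compile and debug dominate:
fast_engine = turnaround_hours(compile_h=4.0, run_h=0.5, waveform_h=1.0, debug_h=2.0)
slow_engine = turnaround_hours(compile_h=0.5, run_h=5.0, waveform_h=0.5, debug_h=1.0)
print(fast_engine, slow_engine)  # 7.5 vs 7.0 -- similar despite the runtime gap
```

This is why compile time, waveform generation, and debugging all matter when judging an STB flow, not the headline execution speed alone.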
The key to ICE is the availability of hardware
interfaces and target boards. ICE is always tricky
because the emulator runs much more slowly
than the real system will run. System characteristics
that make ICE successful include:
• Collection of ICE-ready vertical solutions for popular
protocols to minimize custom hardware
• Support for all types of ICE targets
– Static targets that enable the provided clock to be
slow or even stopped
– Dynamic targets where the target board needs a
constant or minimum clock frequency to operate
correctly and stopping the clock would be fatal
– Targets that drive a clock into the emulator
• ICE-friendly debugging
– Connections to software debugging environments
– Large trace depths for debugging dynamic targets
The keys to emulation discussed above help
determine the most important modes of operation
and the solutions that best fit a set of verification
projects. The next section describes the characteristics
of the Cadence® Palladium® and Xtreme products,
along with the strengths of each.
Cadence Palladium systems are built using ASICs
designed specifically for emulation. Each ASIC in
a Palladium system implements hundreds of programmable Boolean processors for logic evaluation.
Each ASIC is combined with memory on a ceramic
multi-chip module (MCM) for modeling design
memory and for debugging. The Palladium II custom
processor is fabricated using IBM CU08, eight-layer
copper-process technology. Each board in the system
contains a number of interconnected MCMs to
increase logic and memory capacity.
Palladium uses the custom-processor array to
implement a statically scheduled, cycle-based
evaluation algorithm. Each processor is capable of
implementing any four-input logic function using
any results of previous calculations. During
compilation, the design is flattened and broken
into four-input logic functions. Each calculation is
assigned to a specific processor during a specific
time slot of the emulation cycle. The emulation
cycle comprises the number of time slots, or steps,
required for all the processors to evaluate the logic
of the design. Typically, the number of steps required
for a large design is somewhere between 125 and
320 steps. Performance can be improved through
compiler enhancements that result in a shorter
emulation cycle. The other benefit of the evaluation
algorithm used by Palladium is its ability to achieve
higher performance by adding more processors.
The availability of processors enables greater parallelization of the computations, and has a direct impact
on performance. Because the scheduling of evaluations is fixed, performance doesn’t depend on
design activity or design gate-count.
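The scheduling concept can be illustrated with a toy sketch of static, cycle-based scheduling (the netlist format and numbers are invented for the example; this is not Palladium's actual compiler). Gates are levelized by input dependency and packed into time slots; more processors per slot means fewer steps per emulation cycle, and the emulated clock rate is roughly the processor clock divided by the step count (for example, a 300 MHz processor clock with 300 steps would yield 1 MHz):

```python
from collections import Counter

def schedule(netlist, num_processors):
    """Statically assign four-input logic functions to (step, processor) slots.

    netlist: dict gate -> list of input gates (each gate has <= 4 inputs).
    A gate can only be scheduled after all of its inputs; the emulation
    cycle length is the number of steps needed to evaluate every gate.
    """
    level = {}  # earliest step at which each gate can be evaluated
    def depth(g):
        if g not in level:
            ins = netlist.get(g, [])
            level[g] = 0 if not ins else 1 + max(depth(i) for i in ins)
        return level[g]
    for g in netlist:
        depth(g)
    # Pack the gates of each level into steps, num_processors gates per step
    per_level = Counter(level[g] for g in netlist)
    steps = 0
    for lvl in sorted(per_level):
        steps += -(-per_level[lvl] // num_processors)  # ceiling division
    return steps

# Primary inputs ("a", "b") occupy level-0 slots in this simplified model
netlist = {"a": [], "b": [], "c": ["a", "b"], "d": ["c"], "e": ["c", "d"]}
print(schedule(netlist, num_processors=1))  # 5 steps
print(schedule(netlist, num_processors=4))  # 4 steps
```

Because every evaluation has a fixed slot, the cycle length (and therefore performance) is determined at compile time, independent of design activity.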
The technology used in Palladium gives it many
desirable characteristics for emulation:
Highest performance: Since custom processors are
designed for high-performance emulation right from
the start, using the most advanced semiconductor
processes, the chips are clocked at hundreds of MHz,
resulting in emulation speeds exceeding 1 MHz
without any special advance design planning.
Fastest compile: The processor-based architecture
leads to compile speeds of 10–30 million gates per hour on a single workstation.
Best debugging: The custom hardware of Palladium
is also designed for fast debugging. It supports
multiple modes of debugging depending on how
much information the user would like to see. It also
has dedicated memory and high-bandwidth channels
to access debugging data.
Large memory: The custom hardware and the
advanced technology enable very large on-chip
memory with fast access time. This capability enables
users to incorporate large on-chip and on-board
memories with full visibility into the memory while
maintaining high performance. Designers can use
this memory for test vectors or storing embedded
software code.
Highest capacity: The ability to connect multiple
units without changing the run-time frequency
and bring-up time of the overall environment
makes the Palladium family scalable, and provides
an environment that supports 2–128 million gates
with Palladium and 5–256 million gates with
Palladium II. This is the highest emulation capacity
in the market.
Palladium offers a typical speed in the range of
600 kHz–1 MHz (with many designs running over
1 MHz). It is an ideal system for ICE and STB. A
deterministic algorithm and precise control over I/O
timing enable it to excel in ICE applications where
constant speed is critical. Boundary scan (i.e., JTAG)
is an example of an application that requires constant
clock rates, the ability for the emulator to receive a
clock signal from an external target board, and the
need for hardware debugging when the target
system cannot be stopped. Decades of ICE experience
translate into the most available and reliable off-the-shelf ICE solutions for major markets including
networking, multimedia, and wireless. Palladium
can do simulation acceleration (including SBA and
TBA), but lacks some of the advanced features such
as bi-directional simulation swapping and dynamic-compare with the logic simulator. Palladium also
supports assertions in all modes of operation,
including ICE with dynamic target.
Palladium supports up to 32 users and 61,440 in-circuit
I/O signals, and is a true enterprise-class verification
system with the highest capacity and highest
performance. It can be used by multiple engineers
and multiple projects in a company from anywhere
in the world.
Cadence Xtreme systems are built using
reconfigurable computing co-processors (RCCs)
implemented in programmable logic using high-density commercial FPGAs. Each FPGA implements
hundreds of reconfigurable processors of different
types based on the logic in the design. In addition to
logic evaluation, each FPGA also contains internal
memory that can be used to model design memory.
Each board contains a number of FPGAs combined
with additional static and dynamic memory to
increase logic and memory capacity.
Although Xtreme uses commercial FPGA devices, it
does not operate like other FPGA-based systems.
When considering how to use FPGAs, most engineers
immediately think of synthesizing their design into
a netlist and mapping gate-for-gate, wire-for-wire
onto the array of FPGAs and trying to manage all
the timing issues associated with internal, FPGA
timing and routing between FPGAs. This description
in no way resembles how RCC works. The best way to
think of RCC technology is to think about how a logic
simulator works. A simulator uses an event-based
algorithm that keeps track of signal changes as
events. It schedules new events by putting them into
an event queue. At any particular simulation time,
the algorithm updates new values (based on previous
events) and schedules new events for some future
time. As simulation time advances, only the values
that are changing are updated and no extra work is
done to compute values that don’t change. This
event-based algorithm is the concept behind RCC.
It uses patented, dynamic-event signaling to
implement a simulation-like algorithm. The difference is that the events are all executing in hardware.
Messages are sent between the co-processors as
events occur and updates are required in other co-processors. As with a simulator, new values are only
computed when necessary, based on design activity.
Dynamic-event signaling leads to a system that has
none of the timing issues associated with FPGAs, and
a product that runs and feels like a logic simulator.
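The event-driven concept behind RCC can be sketched as a minimal simulator loop (an illustrative software model of the algorithm described above; it is not Cadence code, and in RCC the events execute in hardware):

```python
import heapq

def simulate(gates, stimuli, end_time):
    """Minimal event-driven evaluation: only gates whose inputs change
    are re-evaluated; unchanged signals cost nothing.

    gates:   dict output -> (function, [input names])
    stimuli: list of (time, signal, value) external events
    """
    values = {}
    fanout = {}  # signal -> gates that read it
    for out, (_, ins) in gates.items():
        for i in ins:
            fanout.setdefault(i, []).append(out)

    queue = list(stimuli)
    heapq.heapify(queue)
    evaluations = 0
    while queue:
        t, sig, val = heapq.heappop(queue)
        if t > end_time or values.get(sig) == val:
            continue  # no change -> no new events (the key to efficiency)
        values[sig] = val
        for out in fanout.get(sig, []):
            fn, ins = gates[out]
            evaluations += 1
            new = fn(*(values.get(i, 0) for i in ins))
            heapq.heappush(queue, (t + 1, out, new))  # unit gate delay
    return values, evaluations

gates = {"c": (lambda a, b: a & b, ["a", "b"]),   # AND gate
         "d": (lambda c: 1 - c, ["c"])}           # inverter
values, evals = simulate(gates, [(0, "a", 1), (0, "b", 1), (5, "a", 1)], 20)
print(values, evals)  # the repeated a=1 at t=5 triggers no re-evaluation
```

As in the description above, work is driven entirely by signal changes: the redundant stimulus at t=5 generates no events at all, which is the efficiency that dynamic-event signaling carries into hardware.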
The technology used in Xtreme gives it many
desirable characteristics for simulation acceleration
and emulation:
Low power: FPGAs consume very little power and the
dynamic-event signaling algorithm used in Xtreme
translates into good emulation performance with
lower clock rates inside the system.
Small form factor: Low power combined with high-density FPGAs translates into a 50-million-gate system
(with 60–65% utilization) that fits in a desktop form
factor or a 6U rack-mounted chassis. Xtreme is a
portable system that is light, easy to transport, and is
often placed in an engineer’s cubicle instead of a lab
or computer room.
Frequent capacity upgrades and cost reductions: Since
emulation capacity and cost follow the mainstream
FPGA technology curve, new systems that increase
capacity and/or lower cost can be introduced very
rapidly with minimal effort.
Xtreme offers a typical speed in the range of 150–300
kHz and is an ideal system for simulation acceleration
and TBA. It excels at targetless emulation and in any
environment with a mix of SBA, TBA, and STB.
Xtreme offers better performance than classic
simulation acceleration systems that enforce the
workstation as master. Xtreme is also an ideal system
for accelerating e verification environments using
SpeXtreme, a product that utilizes the behavioral-processing features of Xtreme to increase overall
verification performance. Xtreme can handle ICE,
but lacks many of the advanced ICE features that
are available in Palladium, such as dynamic-target
support and robust ICE-debugging methods.
The Xtreme family supports up to 12 users and 4,656
in-circuit I/O signals, and is a design-team-class
verification system that can be used by multiple
engineers working on a project. It offers excellent
price vs. performance.
Although there is some overlap in capabilities, the
Xtreme and Palladium families of products excel at
different modes of operation, as shown in Figure 4.
As this paper has discussed, all products must
make trade-offs in deciding which parameters should
be optimized and which should be compromised.
With the Xtreme and Palladium families, Cadence is
in the ideal position to serve customers of all types.
Some users choose one of the product lines to use
throughout the phases of the project. Others utilize
both technologies, each one for different phases of
verification, or for different projects.
Palladium offers the highest performance and highest
capacity for emulation power-users with very large
designs. Palladium is the most comprehensive
solution for ICE and was designed with advanced ICE
users in mind.
Because of its simulation-like algorithm, Xtreme
excels at simulation acceleration and TBA and
provides higher performance than other acceleration solutions. It was designed with simulation
acceleration in mind, with concepts like simulation
swap and dynamic compare that make it run and
feel like a logic simulator.
Although the technologies behind Palladium and
Xtreme are different, both are processor-based
architectures that provide automated compile, short
bring-up times, and scalable multiuser capacity not
found in prototyping systems. Both systems provide
more design-turns-per-day than competing systems,
and each has a proven track record of success in the
marketplace. Together, these product lines are being
used by more than two-thirds of all simulation
acceleration and emulation users.
Engineers evaluating emulation need to understand
the keys to success presented here for the different
modes of operation and determine which solution is
the best fit for their verification challenges.
Figure 4: Emulation strengths (Palladium family and Xtreme family sweet spots across block-level, chip-level, and system-level verification performance)