ASIC Design Flow
Step 1: Prepare a Requirement Specification.
Step 2: Create a Micro-Architecture Document.
Step 3: RTL Design & Development of IPs.
Step 4: Perform functional verification of all the IPs, check that the RTL is free from linting errors, and analyze whether the RTL is synthesis friendly.
- Step 4a: Perform cycle-based (functional) verification to verify the protocol behaviour of the RTL.
- Step 4b: Perform property checking to verify that the RTL implementation matches the understanding of the specification.
Step 5: Prepare the design constraints file (clock definitions (frequency/uncertainty/jitter), I/O delay definitions, output pad load definitions, design false paths and multicycle paths) needed to perform synthesis. This file is usually called an SDC (Synopsys Design Constraints) file, the constraint format used by the Synopsys synthesis tool (Design Compiler).
Step 6: Perform synthesis for the IP. The inputs to the tool are the library file that synthesis is targeted for (it holds the functional/timing information of the standard-cell library and the wire-load models, which estimate wire length from the fanout of each connection), the RTL files and the design constraints file. With these the synthesis tool synthesizes the RTL, then maps and optimizes it to meet the design constraints. After synthesis, as part of the synthesis flow, the scan-chain connectivity is built according to the DFT (Design for Test) requirement; the synthesis tool (Test Compiler) builds the scan chain.
Step 7: Check whether the design meets the requirements (Functional/Timing/Area/Power/DFT) after synthesis.
- Step 7a: Perform netlist-level power analysis to determine whether the design meets the power targets.
- Step 7b: Perform gate-level simulation with the synthesized netlist to check whether the design meets the functional requirements.
- Step 7c: Perform formal verification between the RTL and the synthesized netlist to confirm that the synthesis tool has not altered the functionality. (Tool: Formality)
- Step 7d: Perform STA (Static Timing Analysis) with the SDF (Standard Delay Format) file and the synthesized netlist file to check whether the design meets the timing requirements. (Tool: PrimeTime)
- Step 7e: Perform scan-tracing in the DFT tool to check whether the scan-chain is built according to the DFT requirement.
Step 8: Once synthesis is complete, the synthesized netlist file (VHDL/Verilog format) and the SDC (constraints) file are passed as inputs to the Placement and Routing tool to perform the back-end activities.
Step 9: The next step is floor-planning: placing the IPs based on their connectivity, placing the memories, creating the pad-ring, and placing the pads (signal pads, power pads, transfer cells to switch voltage domains, and corner pads for proper accessibility for package routing). The floorplan must meet the SSN (Simultaneous Switching Noise) requirements, so that a switching high-speed bus does not create noise-related problems, and must be optimised so that the design meets the utilization targets of the chip.
- Step 9a: Release the floor-planned information to the package team to perform the package feasibility analysis for the pad-ring.
- Step 9b: In the placement tool, rows are cut and blockages are created where the tool is prevented from placing cells; then the physical placement of the cells is performed based on the timing/area requirements. The power grid is built to meet the power targets of the chip.
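The utilization target mentioned in Step 9 can be checked with simple arithmetic. The sketch below is only an illustration: the function name and the area figures are hypothetical, and utilization is taken here as standard-cell plus macro area divided by core area, which is one common definition (tools differ in how they treat macros and blockages).

```python
def core_utilization(std_cell_area, macro_area, core_area):
    """Fraction of the core area occupied by standard cells and macros."""
    if core_area <= 0:
        raise ValueError("core_area must be positive")
    return (std_cell_area + macro_area) / core_area


# Hypothetical areas in square microns.
util = core_utilization(std_cell_area=300_000, macro_area=200_000,
                        core_area=1_000_000)
target = 0.70  # assumed utilization target, not taken from the flow above
print(f"utilization = {util:.2f}, within target: {util <= target}")
```

A floorplan whose utilization is well above the target usually needs a larger core area or fewer placement blockages before placement is attempted.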
Step 10: The next step is routing: first global routing and then detailed routing, meeting the DRC (Design Rule Check) requirements of the fabrication process.
Step 11: After routing, the routed Verilog netlist and the standard-cell LEF/DEF files are taken to the extraction tool, which extracts the parasitic (RLC) values of the chip and generates a file in SPEF (Standard Parasitic Exchange Format). (Tool: StarRC)
Step 12: Check whether the design meets the requirements (Functional/Timing/Area/Power/DFT/DRC/LVS/ERC/ESD/SI/IR-Drop) after the placement and routing step.
- Step 12a: Perform routed netlist-level power analysis to determine whether the design has met the power targets.
- Step 12b: Perform gate-level simulation with the routed netlist to check whether the design meets the functional requirements.
- Step 12c: Perform formal verification between the RTL and the routed netlist to confirm that the place & route tool has not altered the functionality.
- Step 12d: Perform STA (Static Timing Analysis) with the SPEF file and the routed netlist file to check whether the design meets the timing requirements.
- Step 12e: Perform scan-tracing in the DFT tool to check whether the scan-chain is built according to the DFT requirement, run fault-coverage analysis with the DFT tool, and generate the ATPG test vectors.
- Step 12f: Convert the ATPG test vectors to a tester-understandable format (WGL).
- Step 12g: Perform DRC (Design Rule Check) verification, also called physical verification, to confirm that the design meets the fabrication requirements.
- Step 12h: Perform the LVS (Layout vs Schematic) check, which converts the routed netlist to SPICE (call it SPICE-R), converts the synthesized netlist to SPICE (call it SPICE-S), and compares the two to confirm that they match.
- Step 12i: Perform the ERC (Electrical Rule Checking) check to confirm that the design meets the ERC requirements.
- Step 12j: Perform the ESD check, so that proper back-to-back diodes are placed and proper guarding is present when the chip has both analog and digital portions. Separate power and ground supplies are used for the digital and analog portions to reduce substrate noise.
- Step 12k: Perform a separate STA (Static Timing Analysis) run to verify the signal integrity of the chip. The routed netlist and the SPEF file (parasitics including coupling capacitance values) are fed to the STA tool. This check is important because signal-integrity effects cause crosstalk delay and crosstalk noise, which can harm the functionality and timing of the design.
- Step 12l: Perform IR-drop analysis to confirm that the power grid is robust enough to withstand the static and dynamic voltage drops within the design and that the IR drop is within the target limits.
Step 13: Once the routed design is verified against the design constraints, the next step is the chip-finishing activities (such as metal slotting and placing decoupling caps).
Step 14: The chip design is now ready to go to the fabrication unit. Release it in a file format the fab can understand: the GDS file.
Step 15: After the GDS file is released, perform the LAPO check to confirm that the database released to the fab is correct.
Step 16: Perform the package wire-bonding, which connects the chip to the package.
Synthesis is the process of converting RTL (synthesizable Verilog code) into a technology-specific gate-level netlist (which includes nets, sequential and combinational cells, and their connectivity).
Goals of Synthesis
1. To get a gate level netlist
2. Inserting clock gates
3. Logic optimization
4. Inserting DFT logic
5. Logic equivalence between RTL and netlist should be maintained
Input files required
1. Tech related:
· .tf- technology related information.
· .lib-timing info of standard cell & macros
2. Design related:
· .v- RTL code.
· SDC- Timing constraints.
· UPF- power intent of the design.
· Scan config- Scan related info like scan chain length, scan IO, which flops are to be considered in the scan chains.
3. For Physical aware:
· RC co-efficient file (tluplus).
· LEF/FRAM- abstract view of the cell.
· Floorplan DEF- locations of IO ports and macros.
What is Synthesis?
Explain what role the Synopsys DesignWare libraries fulfill in the synthesis process.
What is the difference between a high-level synthesis tool (as represented by Synopsys Behavioral Compiler) and a logic synthesis tool (as represented by Synopsys Design Compiler)?
Explain what it means for a Synopsys DesignWare component to be 'inferred' by a synthesis tool.
What are the different power reduction techniques?
How do you perform synthesis with multi-Vt libraries?
What are the advantages of clock gating?
You are given a circuit in which one of the inputs, X, has a high toggle rate. What steps would you take to reduce the power in that circuit?
You are asked to realize a Boolean equation. The follow-up question is how to make efficient use of power in that circuit.
You are given a circuit and asked to set certain timing-exception commands on a particular path.
What is the difference in PT timing analysis between pre-layout and post-layout designs?
What do you mean by FSM states?
Draw the timing waveforms for the given circuit.
What are the effects of setup time and hold time on circuit behavior in different situations?
What is the difference between the constraints files for pre-layout and post-layout?
What is SPEF? Have you used it? How can you use it?
What differences did you find (or can you find) in the netlist and the timing behavior between pre-layout and post-layout timing analysis?
What is clock uncertainty, clock skew and clock jitter?
What are the reasons for skew and jitter?
What is clock tree synthesis?
What are the timing-related commands with respect to clocks?
In the front end, you set ideal network conditions on certain pins/clocks etc. Why? How is this taken care of in the back end?
Which library have you used?
What differences did you (or can you) find between TSMC and IBM libraries?
Draw the LSSD cell structure in TSMC and IBM libraries.
Every tool has some drawbacks. What drawbacks do you find in PrimeTime?
What differences do you find when you switch from 130nm to 90nm?
Explain the basic ASIC design flow. Where does your work start? What is your role?
What does 90nm technology mean?
What issues have you faced in your designs?
Perform the setup and hold check for the given circuit.
Why are setup and hold required for a flop?
Did you have any timing margin between synthesis and P&R? How much should the margin be?
What are the inputs for synthesis and timing analysis from the RTL and P&R teams? Are there any inputs for changing the scripts?
How will you fix setup and hold violations?
What constraints did you use for synthesis? Who decides the constraints?
What is uncertainty?
What are false paths and multicycle paths? Give examples. For a given false-path example, what will you do for timing analysis?
What strategies did you use for power optimization in your recent project?
Why are max and min capacitance required?
You have two different frequencies for launch (say 75 MHz) and capture (say 100 MHz). What will happen to the data? Draw the waveform. If there is a hold problem, what will you do?
What is metastability? How do you overcome it? If a metastable condition exists, which frequency will you use as the clock, faster or slower? Why?
Have you used Formality? For a given block, what checks will it do? How does it verify inside the block?
If you changed the port names during synthesis, how will you inform Formality?
Why do you use Power Compiler? What is clock gating? What are the advantages and disadvantages of clock gating? Draw the clock-gating circuit and explain it.
How will you control clock-gating inference for a block of registers? Write the command for it.
Write the total power equation. What is leakage power? Write the equation for it.
If the outputs of a clock-gated flop and a non-clock-gated flop are connected to an AND gate, what problem can you expect? How do you avoid it?
Write the states of a sequence detector that detects 10. How will you optimize it? Write the Verilog code for it.
What is jitter? What causes it? How do you account for it? What is the command for that?
What is clock latency? How do you specify it? What is the command for that?
What is dynamic timing analysis? How does it differ from static timing analysis? Which is more accurate, and why?
Give an example of dynamic timing analysis. Do you know anything about GCL simulation?
What is a free-running clock?
What type of operating conditions do you consider for post-layout timing analysis?
What is the one-hot encoding technique? What are its advantages? What types of encoding are there?
Which scripting languages do you know?
Constant folding
Verilog constructs in synthesis
Unsupported constructs in synthesis
Synthesis inputs
Outputs to be given to the PD team
Power optimization
Area optimization
DRC
Auto ungrouping
SPEF
DEF
How will you analyze the timing of different modes in the design? How many modes did you have in your design? What are the clock frequencies?
What does your script contain?
Design the digital circuit for the condition below: "whenever data changes from one to zero or zero to one, the circuit should generate a pulse of one clock period length".
Have you come across any design with latches? What is the problem in timing analysis if you have a latch in your design?
Have you come across any multiple-clock design? What are the issues in multiple-clock designs?
What do you mean by synthesis strategies?
Latency, clock skew
Clock constraint
Ideal clock
Constraining register paths
Multiple output paths: constraints
Combo path constraining
Global skew, local skew
Positive skew, negative skew, useful skew
Timing reports
Group/ungroup
Boundary optimization
Top-down design flow vs bottom-up design flow: when is each preferred?
Design aware: why is it used?
Translation, mapping, optimization
Min delay, max delay
Timing exceptions
Operating conditions: how to pick them for synthesis?
Latches in ASICs: problems
Logical library vs physical library
LEF vs DEF: how significant?
Scan synthesis: use of lock-up latches
Reset synchronizer: reset recovery and removal times
Network conditions on certain pins/clocks: why?
Register timing
Pipelining as an optimization technique: Verilog used
Derate factor
Crosstalk
Is it true that synthesis transformations take less time at the top abstraction levels?
Is it true that synthesis transformations give refined results at the top abstraction levels?
What will a well formed case statement synthesize to?
What will happen to a design that is synthesized without any constraints?
STATIC TIMING ANALYSIS
Dynamic Timing Analysis
Advantages:
1. Extends coverage of circuit simulation (edges to region).
2. Evaluates worst-case timing using both minimum and maximum delay values for components.
3. Uses the same test stimulus as logic simulation.
4. Does not report false errors.
Disadvantages:
1. It is not complete.
2. It is not path oriented.
3. It is slower than logic simulation and may require additional test stimulus.
4. It requires functional behavioral models.
Dynamic timing analysis extends logic simulation by reporting violations in terms of simulation times and states. To test circuit timing under worst-case conditions, dynamic timing analysis evaluates the circuit using minimum and maximum propagation delays for each component in the design.
Since dynamic timing analysis performs a simulation, it can use the same stimulus as a logic simulation. Because the stimulus functionally exercises the design, unused or uninteresting paths are not tested, so they do not produce false errors.
Note that a timing simulation reports results differently than a logic simulation.
A logic simulation reports results as edge times, and a timing simulation reports results as regions of ambiguity. The results of a timing simulation do not specify exactly when an event occurs; they specify a range of time in which an event can occur.
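The idea of a region of ambiguity can be illustrated with a few lines of Python. This is a minimal sketch, not how any particular simulator works: the function name and the delay values are hypothetical, and each component is modelled only by a (minimum, maximum) propagation delay.

```python
def propagate_window(start_window, stage_delays):
    """Propagate an arrival window through a chain of components.

    start_window: (earliest, latest) time of the input event, in ns.
    stage_delays: list of (min_delay, max_delay) per component, in ns.
    Returns the (earliest, latest) window at the chain output.
    """
    earliest, latest = start_window
    for d_min, d_max in stage_delays:
        if d_min > d_max:
            raise ValueError("min delay must not exceed max delay")
        earliest += d_min   # fastest possible arrival
        latest += d_max     # slowest possible arrival
    return earliest, latest


# Hypothetical three-gate chain driven by an edge at exactly t = 10 ns.
chain = [(1, 2), (2, 4), (1, 3)]
window = propagate_window((10, 10), chain)
print(window)  # (14, 19): the output may change anywhere in this region
```

A logic simulation with a single delay per gate would report one edge time; the timing simulation reports the whole window instead, and the window widens with every stage.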
Static Timing Analysis
Advantages:
1. It resembles manual analysis methods.
2. It is path oriented and finds all setup and hold violations.
3. It does not require stimulus or functional models.
4. It is faster than simulation (for the same amount of coverage).
Disadvantages:
1. It can report false errors.
2. It cannot detect timing errors related to logical operation.
Static timing analysis tools typically use timing models at the logic primitive level. The timing parameters are typically similar among different timing tools. The following are some of the common timing parameters for primitive logic gates, flip-flops and latches.
Timing Measurements for Primitive Gates
Transition time is the time between one specified voltage level and another voltage level for a given signal. Transition rise time is the time between a specified low voltage level and a specified high voltage level. Transition fall time is the time between a specified high voltage level and a specified low voltage level.
Setup time: the minimum time for which the data (or control) input must be held constant before the triggering edge of the clock pulse.
Hold time: the minimum time for which the data (or control) input must be held constant after the triggering edge of the clock pulse.
Metastability: if the setup and hold window is violated, a metastable state can occur in which the output cannot settle to a valid logic state for some time and may hover or oscillate between 0 and 1.
To recover from metastability there are a number of techniques available:
One of them is the use of a 2-flip-flop or 3-flip-flop synchronizer, depending on the MTBF (Mean Time Between Failures) target, to give the metastable output enough time to settle.
Force the flip-flop into a valid logic state so that it does not become metastable, or wait at the output so that the circuit leaves the metastable state on its own.
Proper use of the mux recirculation technique and a mesochronous synchronizer also reduces metastability.
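The choice between a 2-flop and a 3-flop synchronizer is usually justified with an MTBF estimate. The sketch below uses the widely quoted first-order model MTBF = exp(t_resolve / tau) / (T_w * f_clk * f_data). The flop parameters tau and T_w are hypothetical placeholders (real values come from cell characterization), and the resolution time is simplified to whole clock periods, ignoring setup time and clock-to-Q delay.

```python
import math


def synchronizer_mtbf(t_resolve, tau, t_window, f_clk, f_data):
    """First-order MTBF estimate, in seconds, for a flop synchronizer.

    t_resolve: time available for metastability to resolve (s)
    tau:       resolution time constant of the flop (s)
    t_window:  metastability window of the flop (s)
    f_clk:     sampling clock frequency (Hz)
    f_data:    toggle rate of the asynchronous input (Hz)
    """
    return math.exp(t_resolve / tau) / (t_window * f_clk * f_data)


# Hypothetical flop and clock parameters, for illustration only.
TAU = 50e-12
T_WINDOW = 100e-12
F_CLK = 200e6      # 5 ns clock period
F_DATA = 10e6

period = 1.0 / F_CLK
two_flop = synchronizer_mtbf(period, TAU, T_WINDOW, F_CLK, F_DATA)
three_flop = synchronizer_mtbf(2 * period, TAU, T_WINDOW, F_CLK, F_DATA)
print(f"2-flop MTBF ~ {two_flop:.3e} s, 3-flop MTBF ~ {three_flop:.3e} s")
```

Because the resolution time sits in the exponent, each extra synchronizer stage (or a slower sampling clock) raises the MTBF by many orders of magnitude.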
Reducing the clock frequency helps with setup violations, and adding buffers on the data path helps with hold violations. In short, timing on a logic circuit is generally calculated on four different types of paths.
The paths are:-
Data path
Clock path
Clock gating path
Asynchronous path.
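For a flop-to-flop data path, the setup and hold checks reduce to two slack expressions built from the data path and clock path delays. The following is a minimal sketch under simple assumptions (single clock, ideal edges, no uncertainty or derating); the function names and the picosecond values are hypothetical.

```python
def setup_slack(period, launch_clk, capture_clk, clk_to_q, data_max, t_setup):
    """Setup slack: data must arrive before the next capture edge minus setup."""
    required = period + capture_clk - t_setup
    arrival = launch_clk + clk_to_q + data_max
    return required - arrival


def hold_slack(launch_clk, capture_clk, clk_to_q, data_min, t_hold):
    """Hold slack: data must not change until hold time after the same edge."""
    arrival = launch_clk + clk_to_q + data_min
    required = capture_clk + t_hold
    return arrival - required


# Hypothetical path, all values in picoseconds.
s = setup_slack(period=1000, launch_clk=100, capture_clk=120,
                clk_to_q=80, data_max=850, t_setup=50)
h = hold_slack(launch_clk=100, capture_clk=120,
               clk_to_q=80, data_min=30, t_hold=40)
print(s, h)  # 40 50: both slacks positive, so both checks pass
```

A negative setup slack means the slowest data arrives too late; a negative hold slack means the fastest data arrives too early. Note that the clock period appears only in the setup expression, which is why lowering the frequency cannot repair a hold violation.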
1) How to solve setup and hold violations in the design
To address setup time violations, you can:
Use larger/stronger cells to drive paths with high capacitance, which reduces the transition time on sluggish nets.
Adjust the clock skew at the start point or endpoint of the violating path (useful skew, a form of time borrowing).
Move gates to make the total distance between cells in the violating path smaller (less capacitance to drive means faster transitions).
Insert retiming flops on the path, if the design allows it (perform the operation in two clock cycles instead of one).
Reduce the overall clock frequency.
For hold time violations, you can:
Skew the clock at the start point/endpoint (the reverse of the setup fix) to make the endpoint clock arrive earlier.
Insert cells along the path to increase the propagation time (for example, chains of buffers).
Reduce the drive strength of cells on the path to increase the transition time.
Why Timing Constraints?
Timing constraints are an important part of designing ASICs or FPGAs. Generally, we want to make sure that the design is functionally correct through verification methods, and that it will behave correctly after manufacturing through timing analysis.
How Timing Analysis?
There are two ways to perform timing analysis.
Dynamic Timing Analysis requires a set of input vectors to check the timing characteristics of the paths in the design. If we have N inputs, we need 2^N simulation combinations to get a full timing analysis.
Static Timing Analysis checks for timing violations without simulation. This is faster but does not check functionality issues.
ASIC Design Flow:
The flow is referred to as the RTL2GDSII flow, and the process of generating the GDSII is termed tapeout. The ASIC digital flow is divided into logical and physical flows, i.e. the front end and the back end.
Logical design
A- RTL Design
Specification >> System Architecture >> RTL Design >> Functional Verification
The flow starts with the high-level design specification, in which the designer sets the area, speed and power requirements. Then the designer defines the chip architecture.
RTL (Register Transfer Level) design describes the functional behavior using an HDL (hardware description language): VHDL, Verilog or SystemVerilog.
Functional verification verifies the functionality using simulation.
B- Synthesis
Synthesis >> DFT >> Equivalence Checking >> Static Timing Analysis
Synthesis is the first step of converting the RTL to a gate netlist based on timing, power and area constraints.
DFT prepares the design for testability. Scan insertion is a common technique that helps make all registers in the design controllable and observable.
Equivalence checking verifies the functionality of the gate netlist against the RTL description using formal verification techniques.
STA (static timing analysis) is a method of checking the ability of the design to meet the timing requirements statically, without simulation. The designer is responsible for specifying timing constraints to model how the design needs to be constrained, and the STA tools check that the design meets the timing requirements. The designer uses an industry-standard format, SDC (Synopsys Design Constraints). STA at this stage acts as the bridge between logical and physical design.
Physical design
A- Layout
Floor Planning >> Placement >> Clock Tree Synthesis >> Routing
The flow starts with floor planning, in which the logical blocks of the design are placed considering many optimization factors to account for area, speed and power. Then placement occurs, where the standard cells are placed. Placement is followed by clock tree synthesis to distribute the clock and reduce clock skew between different parts of the design. Routing, where the connections between the placed cells are made, is the final step to generate the layout. During physical design, STA may be run multiple times to perform a more accurate timing analysis.
B- Tapeout
LVS >> DRC >> Signoff STA >> GDSII release
Two steps are needed to verify the layout. LVS (Layout versus Schematic) matches the layout with the netlist generated after synthesis. DRC (Design Rule Checking) confirms that all rules laid out by the foundry where the chip will be fabricated are adhered to. Then signoff static timing analysis is performed. Finally comes the GDSII release; fabs manufacture chips based on the GDSII.
Timing Constraints
From a timing perspective, the designer creates timing constraints for synthesis: a series of constraints applied to a given set of paths or nets that dictate the desired performance of the design. Constraints may be period, frequency, net skew, maximum delay between end points, or maximum net delay.
Static Timing Analysis
EDA tools check setup, hold and removal constraints, clock gating constraints, maximum frequency and any other design rules. They take the design netlist, timing libraries, delay information and timing constraints as inputs to perform static timing analysis.
Static timing analysis is a method for determining whether a circuit meets timing constraints without having to simulate, so it is much faster than timing-driven, gate-level simulation. STA and equivalence checking are performed at many steps in the digital design flow: after synthesis, scan insertion, placement, clock tree synthesis and routing.
What are the inputs you get for Block level Physical Design?
Netlist (.v /.vhd)
Timing Libraries (.lib/.db)
Library Exchange Format (LEF)
Technology files (.tf/.tech.lef)
Constraints (SDC)
Power Specification File
Clock Tree Constraints
Optimization requirements
IO Ports file
Floorplan file
What are the different checks you do on the input netlist?
Floating pins
Unconstrained pins
Undriven input ports
Unloaded output ports
Pin direction mismatches
Multiple drivers
Zero wire load timing checks
Issues with respect to the library file, timing constraints, IOs and optimization requirements.
How to do macro placement in a block
Analyse the fly-lines for connectivity between macros and between macros and IO ports.
Group and place macros of the same hierarchy together.
Calculate/estimate the channel length required between macros.
Avoid odd shapes.
Place macros around the block periphery, so that the core area holds the common logic.
Keep enough room around macros for IO routing.
Add the necessary blockages around the macros, such as halos.
What issues do you see if the floorplan is bad?
Congestion near macro corners due to insufficient placement blockage.
Standard-cell placement in narrow channels leads to congestion.
Macros of the same partition that are placed far apart can cause timing violations.
What are the different optimization techniques?
Cell sizing: size cells up or down to meet timing/area.
Vt swapping
Cloning: fanout reduction
Buffering: buffers are added in the middle of long net paths to reduce the delay.
Logical restructuring: breaking complex cells into simpler cells or vice versa
Pin swapping
What are the inputs for CTS?
CTS SDC
Max skew
Max and min insertion delay
Max transition, capacitance, fanout
Number of buffer levels
Buffer/inverter list
Clock tree routing metal layers
Clock tree root pin, leaf pin, preserve pin, through pin and exclude pin
What is Metal Fill
The metal density rule helps to avoid over-etching or metal erosion.
The empty metal tracks are filled with metal shapes to meet the metal density rules.
There are two types of metal fill:
Floating metal fill: does not completely shield the aggressor nets, so some SI effects remain.
Grounded metal fill: completely shields the aggressor nets, so SI effects are smaller.
Why the Metal Fill is required
If there are large gaps between the routed metal shapes (empty tracks), more of the etchant collects in these gaps during etching, which over-etches the existing metal and may create opens. To keep the metal density uniform across the chip, dummy metal is added in these empty tracks.
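The density rule amounts to a simple area check per window. The sketch below shows how much dummy metal a window would need to reach a minimum density; the function name, window size and the 20% threshold are hypothetical, since the real limits (including maximum density and window stepping) come from the foundry rule deck.

```python
def fill_area_needed(window_area, metal_area, min_density):
    """Dummy-metal area required to bring one window up to min_density."""
    if window_area <= 0:
        raise ValueError("window_area must be positive")
    density = metal_area / window_area
    if density >= min_density:
        return 0.0  # rule already met, no fill needed
    return min_density * window_area - metal_area


# Hypothetical 100 um x 100 um window with 1500 um^2 of routed metal
# and an assumed minimum density of 20%.
needed = fill_area_needed(window_area=100 * 100, metal_area=1500,
                          min_density=0.20)
print(f"fill needed: {needed:.0f} um^2")
```

A fill tool repeats this kind of check over every window on every metal layer and inserts floating or grounded shapes where the density falls short.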
What are the reasons for routing congestion?
Inefficient floorplan
Improper macro placement or macro channels
Placement blockages not given
No macro-to-macro channel space given
High cell density
High local utilization
A high number of complex cells with high pin counts, such as AOI/OAI cells, placed together
Placement of standard cells near macros
Logic optimization not done properly
High pin density at the edge of the block
Too many buffers added during optimization
IO ports are crisscrossed; they need to be aligned in proper order.
What are the different methods to reduce congestion?
Review the floorplan/macro placement according to the block size and port placement.
Add proper placement blockages in channels and around the macro boundaries.
Reduce the local density using the percentage utilization/density screens.
Apply cell padding to high pin density cells, like AOI/OAI.
Check and reorder the scan chain if needed.
Run congestion-driven placement with high effort.
Check that the power network is proper and on the routing tracks. If it is not on track, adjacent routing tracks may be unusable, which can lead to congestion.
Why are power stripes routed in the top metal layers?
The top metal layers have lower resistance (they are thicker and wider), so less IR drop is seen in the power distribution network. If power stripes were routed in the lower metal layers, they would use a good amount of the lower routing resources and could therefore create routing congestion.
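The IR-drop argument can be made concrete with the sheet-resistance form of Ohm's law, V = I * Rsheet * L / W. The numbers below are hypothetical and only chosen to show the trend: a thick top metal with low sheet resistance against a thin lower metal, same stripe geometry and current.

```python
def stripe_ir_drop(sheet_res, length, width, current):
    """Voltage drop (V) along a stripe: I * R, with R = Rsheet * L / W.

    sheet_res: sheet resistance in ohms per square
    length, width: stripe dimensions in the same unit (e.g. microns)
    current: current carried by the stripe in amperes
    """
    if width <= 0:
        raise ValueError("width must be positive")
    resistance = sheet_res * length / width
    return current * resistance


# Hypothetical sheet resistances; real values come from the technology file.
LOWER_METAL_RSHEET = 0.10   # ohm/sq, thin lower layer (assumed)
TOP_METAL_RSHEET = 0.02     # ohm/sq, thick top layer (assumed)

lower = stripe_ir_drop(LOWER_METAL_RSHEET, length=1000, width=5, current=0.01)
top = stripe_ir_drop(TOP_METAL_RSHEET, length=1000, width=5, current=0.01)
print(f"lower-metal drop = {lower * 1e3:.0f} mV, top-metal drop = {top * 1e3:.0f} mV")
```

With these assumed values the same stripe drops five times less voltage on the top layer, in proportion to the sheet resistance.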
Why do you use the alternating routing approach HVH/VHV (Horizontal-Vertical-Horizontal/Vertical-Horizontal-Vertical)?
This approach improves the routability of the design and makes better use of routing resources.
What are several factors to improve the propagation delay of a standard cell?
Improve the input transition to the cell under consideration by upsizing the driver.
Reduce the load seen by the cell under consideration, either by placement refinement or buffering.
If allowed, increase the drive strength or replace the cell with an LVT (low threshold voltage) cell.
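The first two factors map directly onto how timing libraries commonly model cell delay: as a two-dimensional lookup table indexed by input transition and output load, interpolated between grid points. The sketch below performs a bilinear interpolation on such a table. The table values, axis points and function name are hypothetical, not taken from any real library.

```python
from bisect import bisect_right


def interpolate_delay(slews, loads, table, slew, load):
    """Bilinear interpolation in a delay table (rows: slew, columns: load).

    Points outside the table are extrapolated from the nearest segment.
    """
    def bracket(axis, x):
        # Index i of the segment [axis[i], axis[i + 1]] used for x.
        i = bisect_right(axis, x) - 1
        return max(0, min(i, len(axis) - 2))

    i = bracket(slews, slew)
    j = bracket(loads, load)
    ts = (slew - slews[i]) / (slews[i + 1] - slews[i])
    tl = (load - loads[j]) / (loads[j + 1] - loads[j])
    low = table[i][j] + (table[i][j + 1] - table[i][j]) * tl
    high = table[i + 1][j] + (table[i + 1][j + 1] - table[i + 1][j]) * tl
    return low + (high - low) * ts


# Hypothetical table: input transition in ns, load in fF, delay in ns.
SLEWS = [0.05, 0.20, 0.80]
LOADS = [1.0, 4.0, 16.0]
DELAY = [
    [0.10, 0.16, 0.40],
    [0.14, 0.20, 0.44],
    [0.30, 0.36, 0.60],
]

before = interpolate_delay(SLEWS, LOADS, DELAY, slew=0.50, load=8.0)
after = interpolate_delay(SLEWS, LOADS, DELAY, slew=0.10, load=2.0)
print(f"delay before fixes = {before:.3f} ns, after = {after:.3f} ns")
```

Upsizing the driver moves the lookup to a smaller input transition (a lower row), and buffering or placement refinement moves it to a smaller load (a column further left); both reduce the interpolated delay.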
How do you compute net delay (interconnect delay) / decode RC values present in the tech file?
What are the various ways of timing optimization in synthesis tools?
Logic optimization: buffer sizing, cell sizing, level adjustment, dummy buffering etc.
Fewer logic levels between flip-flops speed up the design.
Optimize the drive strength of the cell, so that it is capable of driving more load, which reduces the cell delay.
Better selection of DesignWare components (select timing-optimized DesignWare components).
Use LVT (low threshold voltage) and SVT (standard threshold voltage) cells if allowed.
What would you do in order to not use certain cells from the library?
Set the dont_use attribute on those library cells (for example, with set_dont_use).
How are delays characterized using a WLM (Wire Load Model)?
For a given wire load model, the delay is estimated based on the fanout of the cell driving the net.
Fanout vs net length is tabulated in WLMs.
Values of unit resistance R and unit capacitance C are given in the technology file.
Net length varies based on the fanout number.
Once the net length is known, the delay can be calculated; sometimes it is again tabulated.
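The steps above can be sketched as a small calculation: look up the net length from the fanout, scale by the unit R and C, and form a delay. The table, slope, unit values and function names here are hypothetical and in arbitrary units, and the delay uses a simple lumped model in which the whole wire resistance sees the wire plus pin capacitance, which is only one of several possible tree assumptions.

```python
def wlm_net_length(fanout, length_table, slope):
    """Estimated net length for a fanout; extrapolate with slope past the table."""
    max_fanout = max(length_table)
    if fanout in length_table:
        return length_table[fanout]
    if fanout > max_fanout:
        return length_table[max_fanout] + slope * (fanout - max_fanout)
    raise ValueError(f"fanout {fanout} is not covered by the table")


def wlm_net_delay(fanout, pin_cap, length_table, slope, r_unit, c_unit):
    """Lumped-RC net delay estimate from a wire load model."""
    length = wlm_net_length(fanout, length_table, slope)
    r_net = r_unit * length              # total wire resistance
    c_net = c_unit * length              # total wire capacitance
    return r_net * (c_net + fanout * pin_cap)


# Hypothetical wire load model (arbitrary units).
LENGTH_TABLE = {1: 10, 2: 20, 3: 35, 4: 50}
SLOPE = 15       # extra length per fanout beyond the table
R_UNIT = 2       # resistance per unit length
C_UNIT = 3       # capacitance per unit length
PIN_CAP = 5      # input capacitance of each driven pin

print(wlm_net_delay(2, PIN_CAP, LENGTH_TABLE, SLOPE, R_UNIT, C_UNIT))  # 2800
print(wlm_net_delay(6, PIN_CAP, LENGTH_TABLE, SLOPE, R_UNIT, C_UNIT))  # 43200
```

Because both R and C grow with the estimated length, the delay rises faster than linearly with fanout, which is why high-fanout nets dominate pre-layout timing estimates.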
What are the various techniques to resolve congestion/noise?
Routing and placement congestion depend on the connectivity in the netlist; a better floorplan can reduce the congestion.
Noise can be reduced by optimizing the overlap of nets in the design.
Let’s say there are enough routing resources available and timing is fine. Can you increase clock buffers in the clock network? If so, will there be any impact on other parameters?
No. You should not increase clock buffers in the clock network. More clock buffers cause more area and more power. When everything is fine, why would you want to touch the clock tree?
How do you optimize skew/insertion delays in CTS (Clock Tree Synthesis)?
Provide better skew targets and insertion delay values while building the clocks.
Choose an appropriate tree structure, based on clock buffers, clock inverters, or a mix of both.
For multiple clock domains, group the clocks while building the clock tree so that skew is balanced across the clocks (inter-clock skew analysis).
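Skew and insertion delay targets are checked against numbers derived from the clock arrival time at each leaf pin. A minimal sketch, with hypothetical flop names and latencies in picoseconds:

```python
def clock_tree_stats(leaf_latency):
    """Insertion delay and global skew from per-leaf clock latencies."""
    longest = max(leaf_latency.values())
    shortest = min(leaf_latency.values())
    return {"insertion_delay": longest, "global_skew": longest - shortest}


def local_skew(leaf_latency, launch, capture):
    """Skew between one launch/capture flop pair (capture minus launch)."""
    return leaf_latency[capture] - leaf_latency[launch]


# Hypothetical clock latencies at three leaf pins, in ps.
latency = {"ff_a": 520, "ff_b": 560, "ff_c": 545}

print(clock_tree_stats(latency))          # {'insertion_delay': 560, 'global_skew': 40}
print(local_skew(latency, "ff_a", "ff_c"))  # 25
```

Global skew is taken over all leaves, while local skew matters only for flop pairs that actually share a timing path, so a tree can meet timing with a larger global skew than local skew.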
What
are pros/cons of latch/FF (Flip Flop)?
How you go about fixing timing violations for latch- latch paths?
As an engineer, let’s say your manager comes to you
and asks for next project die size estimation/projection, giving data on RTL
size, performance requirements.
How
do you go about the figuring out and come up with die size considering physical
aspects?
How
will you design inserting voltage island scheme between macro pins crossing
core and are at different power wells? What is the optimal resource solution?
What
are various formal verification issues you faced and how did you resolve?
How
do you calculate maximum frequency given setup, hold, clock and clock skew?
What
are effects of metastability?
Consider a timing path crossing from fast clock
domain to slow clock domain. How do you design synchronizer circuit without
knowing the source clock frequency?
How
to solve cross clock timing path?
How
to determine the depth of FIFO/ size of the FIFO?
What
are the challenges you faced in place and route, FV (Formal Verification), ECO
(Engineering Change Order) areas?
How
long the design cycle for your designs?
What
part are your areas of interest in physical design?
Explain
ECO (Engineering Change Order) methodology.
Explain
CTS (Clock Tree Synthesis) flow.
What
kind of routing issues you faced?
How
does STA (Static Timing Analysis) in OCV (On Chip Variation) conditions done?
How do you set OCV (On Chip Variation) in IC compiler? How is timing
correlation done before and after place and route?
Process-Voltage-Temperature (PVT) Variations and Static Timing Analysis
(STA)
If there are too many logic cell pins in one place within the core, what kind of issues would you face and how would you resolve them?
Define hash / @array in Perl.
Using TCL (Tool Command Language, pronounced "tickle"), how do you set variables?
What is the ICC (IC Compiler) command for setting the derate factor / the command to perform physical synthesis?
What are the NanoRoute options for search and repair?
What were your design skew/insertion delay targets?
How is IR drop analysis done? What statistics are available in the reports?
Explain pin density/cell density issues and hotspots.
How will you relate the routing grid to the manufacturing grid and judge whether the routing grid is set correctly?
What is the command for setting a multicycle path?
If a hold violation exists in the design, is it OK to sign off the design? If not, why?
How are timing constraints developed?
Explain timing closure flow/methodology/issues/fixes.
Explain the SDF (Standard Delay Format) back annotation / SPEF (Standard Parasitic Exchange Format) timing correlation flow.
Given a timing path in multi-mode multi-corner, how is STA (Static Timing Analysis) performed in order to meet timing in both modes and corners, and how are PVT (Process-Voltage-Temperature)/derate factors decided and set in the PrimeTime flow?
With respect to clock gates, what issues have you faced at various stages of the physical design flow?
What are synthesis strategies to optimize timing?
Explain the ECO (Engineering Change Order) implementation flow. Given a post-routed database and functional fixes, how will you implement the ECO (Engineering Change Order) and what physical and functional checks do you need to perform?
In building the timing constraints, do you need to constrain all IO (Input-Output) ports?
Can a single port be multi-clocked? How do you set delays for such ports?
How is scan DEF (Design Exchange Format) generated?
What is the purpose of a lockup latch in a scan chain?
Explain short circuit current.
What are the pros and cons of using low-Vt and high-Vt cells?
Multi Threshold Voltage Technique
Issues With Multi Height Cell Placement in Multi Vt Flow
How do you set inter-clock uncertainty?
set_clock_uncertainty <value> -from clock1 -to clock2
In DC (Design Compiler), how do you constrain clocks, IO (Input-Output) ports, max capacitance and max transition?
What are the differences in clock constraints from pre CTS (Clock Tree Synthesis) to post CTS (Clock Tree Synthesis)?
The clock uncertainty values differ, and clocks are propagated post CTS.
Pre CTS the clock is ideal, so the uncertainty covers both estimated skew and jitter; post CTS the skew comes from the propagated clock tree, so the constraint is modified to model only clock jitter (plus any margin).
How is clock gating done?
What constraints do you add in CTS (Clock Tree Synthesis) for clock gates?
Treat the clock gating cells as through pins, so that the clock tree is built and balanced through them instead of stopping at them.
What is the trade-off between dynamic power (current) and leakage power (current)?
Leakage Power Trends
Dynamic Power
How do you reduce standby (leakage) power?
Explain the top level pin placement flow. What parameters decide it?
Given block level netlists, timing constraints, libraries, and macro LEFs (Layout Exchange Format/Library Exchange Format), how will you start floor planning?
With a net length of 1000 um, how will you compute RC values using equations/tech file info?
What do noise reports represent?
What do glitch reports contain?
What are the CTS (Clock Tree Synthesis) steps in IC Compiler?
What does a clock constraints file contain?
How do you analyze clock tree reports?
What do IR drop VoltageStorm reports represent?
Where/when do you use DCAP (Decoupling Capacitor) cells?
What are the various power reduction techniques?
Low Power Design Techniques
What is setup/hold? What are the impacts of setup and hold time on timing? How will you fix setup and hold violations?
Explain the function of a Muxed FF (Multiplexed Flip Flop) / scan FF (Scan Flip Flop).
What is tested in DFT (Design for Testability)?
In equivalence checking, how do you handle the scanen signal?
In terms of CMOS (Complementary Metal Oxide Semiconductor), explain the physical parameters that affect the propagation delay.
What are the power dissipation components? How do you reduce them?
Short Circuit Power
Leakage Power Trends
Dynamic Power
Low Power Design Techniques
How is delay affected by PVT (Process-Voltage-Temperature)?
Process-Voltage-Temperature (PVT) Variations and Static Timing Analysis (STA)
Why is the power signal routed in the top metal layers?
How do you minimize clock skew / balance the clock tree?
Given 11 minterms, derive the logic function.
Given C1 = 10 pF and C2 = 1 pF connected in series with a switch in between, where at t=0 the switch is open, one end is at 5 V and the other end is at zero volts; compute the voltage across C2 when the switch is closed.
Explain the modes of operation of a CMOS (Complementary Metal Oxide Semiconductor) inverter. Show the IO (Input-Output) characteristic curve.
Implement a ring oscillator.
How do you slow down a ring oscillator?
How do you optimize power at various stages in the physical design flow?
What timing optimization strategies do you employ in the pre-layout/post-layout stages?
What are the process technology challenges in physical design?
Design divide-by-2, divide-by-3, and divide-by-1.5 counters. Draw timing diagrams.
What are multi-cycle paths and false paths? How do you resolve multi-cycle and false paths?
Given a flop-to-flop path with combo delay in between and the output of the second flop fed back to the combo logic, which path is the fastest path to have a hold violation and how will you resolve it?
What RTL (Register Transfer Level) coding styles should be adopted to yield an optimal backend design?
Draw timing diagrams to represent propagation delay, setup, hold, recovery, removal, and minimum pulse width.
1. You might want to include things like: what are the different ways of fixing antenna violations? (There are about 4 methods.)
2. Why are non-leaf clock nets routed on the top-most layers?
3. Does jitter affect setup/hold paths?
BACKEND DESIGN INTERVIEWS
* What is signal integrity? How does it affect timing?
* What is IR drop? How do you avoid it? How does it affect timing?
* What is EM and what are its effects?
* What is floor plan and power plan?
* What are types of routing?
* What is a grid? Why do we need it, and what are the different types of grids?
* What is the core, and how will you decide the w/h ratio for the core?
* What is effective utilization and chip utilization?
* What is latency? Give the types?
* What is LEF?
* What is DEF?
* What are the steps involved in designing an optimal pad ring?
* What are the steps that you have done in the design flow?
* What are the issues in floor plan?
* How can you estimate area of block?
* How much aspect ratio should be kept (or have you kept) and what is the
utilization?
* How to calculate core ring and stripe widths?
* What if hot spot found in some area of block? How you tackle this?
* After adding stripes also if you have hot spot what to do?
* What is threshold voltage? How does it affect timing?
* What are the contents of lib, lef, sdc?
* What is meant by 9 track, 12 track standard cells?
* What is a scan chain? What if the scan chain is not detached and reordered? Is it compulsory?
* What is setup and hold? Why do they exist? What if setup and hold are violated?
* In a circuit, for a reg to reg path, Tclktoq is 50 ps, Tcombo is 50 ps, Tsetup is 50 ps, and Tskew is 100 ps. What is the maximum operating frequency?
* How do R and C values affect timing?
* How are ohm (R) and farad (C) related to second (T)?
* What is transition? What if the transition time is too large?
* What is difference between normal buffer and clock buffer?
* What is antenna effect? How it is avoided?
* What is ESD?
* What is cross talk? How can you avoid?
* How double spacing will avoid cross talk?
* What is difference between HFN synthesis and CTS?
* What is hold problem? How can you avoid it?
* For an iteration we have 0.5ns of insertion delay and 0.1 skew and for other
iteration 0.29ns insertion delay and 0.25 skew for the same circuit then which
one you will select? Why?
* What is partial floor plan?
* What parameters (or aspects) differentiate Chip Design & Block level
design??
* How do you place macros in a full chip design?
* Differentiate between a Hierarchical Design and flat design?
* Which is more complicated when you have a 48 MHz and a 500 MHz clock design?
* Name few tools which you used for physical verification?
* What are the input files will you give for primetime correlation?
* What are the algorithms used while routing? Will it optimize wire length?
* How will you decide the Pin location in block level design?
* If the routing congestion exists between two macros, then what will you do?
* How will you place the macros?
* How will you decide the die size?
* If a long metal line is connected to diffusion and poly, then which one will be affected by the antenna problem?
* If the full chip design is routed by 7 layer metal, why macros are designed
using 5LM instead of using 7LM?
* In your project what is die size, number of metal layers, technology,
foundry, number of clocks?
* How many macros in your design?
* What is each macro size and no. of standard cell count?
* How did you handle the clock in your design?
* What are the input needs for your design?
* What does the SDC constraint file contain?
* How did you do power planning?
* How to find total chip power?
* How to calculate core ring width, macro ring width and strap or trunk width?
* How to find number of power pad and IO power pads?
* What are the problems faced related to timing?
* How did you resolve the setup and hold problem?
* If 10000 or more problems come up in your design, then what will you do?
* Which layer do you prefer for clock routing and why?
* If your design has a reset pin, will it affect the input pin or output pin or both?
* During power analysis, if you are facing an IR drop problem, then how did you avoid it?
* Define the antenna problem. How did you resolve it?
* How delays vary with different PVT conditions? Show the graph.
* Explain the flow of physical design and inputs and outputs for each step in
flow.
* What is cell delay and net delay?
* What are delay models and what is the difference between them?
* What is wire load model?
* What do SDC constraints contain?
* Why higher metal layers are preferred for Vdd and Vss?
* What is logic optimization and give some methods of logic optimization.
* What is the significance of negative slack?
* How the width of metal and number of straps calculated for power and ground?
* What is negative slack ? How it affects timing?
* What is track assignment?
* What is grided and gridless routing?
* What is a macro and standard cell?
* What is congestion?
* Whether congestion is related to placement or routing?
* What are clock trees?
* What are clock tree types?
* Which layer is used for clock routing and why?
* What is cloning and buffering?
* What are placement blockages?
* How do slow and fast transitions at the inputs affect timing for gates?
* What is antenna effect?
* What are DFM issues?
* What is .lib, LEF, DEF, .tf?
* What is the difference between synthesis and simulation?
* What is metal density, metal slotting rule?
* What is OPC, PSM?
* Why clock is not synthesized in DC?
* What are high-Vt and low-Vt cells?
* What do corner cells contain?
* What is the difference between core filler cells and metal fillers?
* How to decide the number of pads in chip level design?
* What are tie-high and tie-low cells and where are they used?
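As a worked example for the reg-to-reg question in the list above (Tclktoq = 50 ps, Tcombo = 50 ps, Tsetup = 50 ps, Tskew = 100 ps), here is a minimal sketch. It assumes the skew is adverse, i.e. the capture clock arrives early and so eats into the period; the function name is only illustrative.

```python
def max_frequency_ghz(t_clk_to_q_ps, t_combo_ps, t_setup_ps, t_skew_ps):
    """Maximum clock frequency in GHz for a reg-to-reg path.

    Assumption: skew is adverse (capture clock early), so it is
    added to the minimum clock period.
    """
    min_period_ps = t_clk_to_q_ps + t_combo_ps + t_setup_ps + t_skew_ps
    # 1000 ps = 1 ns, and 1 / ns = GHz
    return 1000.0 / min_period_ps


if __name__ == "__main__":
    # Values from the interview question: minimum period = 250 ps
    print(max_frequency_ghz(50, 50, 50, 100), "GHz")
```

If the skew were favourable (capture clock late) it would be subtracted instead, so the answer depends on stating the skew direction.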
Physical Design Questions and Answers
What parameters (or aspects) differentiate Chip Design and Block level design?
Chip design has I/O pads; block design has pins.
Chip design uses all the metal layers available; block design may not use all metal layers.
A chip is generally rectangular in shape; blocks can be rectangular or rectilinear.
Chip design requires packaging; block design ends in a macro.
How do you place macros in a full chip design?
First check the flylines, i.e. check the net connections from macro to macro and from macro to standard cells.
If there are more connections from macro to macro, place those macros near each other, preferably near the core boundaries.
If an input pin is connected to a macro, it is better to place the macro near that pin or pad.
If a macro has more connections to standard cells, spread the macros inside the core.
Avoid crisscross placement of macros.
Use soft or hard blockages to guide the placement engine.
Differentiate between a hierarchical design and a flat design.
A hierarchical design has blocks and subblocks in a hierarchy; a flattened design has no subblocks and has only leaf cells.
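For a large design, a flat run generally takes more run time and memory in a single session; a hierarchical flow lets blocks be implemented separately (and in parallel), at the cost of extra partitioning, budgeting and integration work.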
Which is more complicated when you have a 48 MHz and a 500 MHz clock design?
500 MHz, because it is more constrained (i.e. it has a shorter clock period) than the 48 MHz design.
Name a few tools which you used for physical verification.
Hercules from Synopsys, Calibre from Mentor Graphics.
What input files will you give for PrimeTime correlation?
Netlist, technology library, constraints, SPEF or SDF file.
If routing congestion exists between two macros, then what will you do?
Provide a soft or hard blockage.
How will you decide the die size?
By checking the total area of the design you can decide the die size.
If a long metal line is connected to diffusion and poly, then which one will be affected by the antenna problem?
Poly
If the full chip design is routed in 7 metal layers, why are macros designed using 5LM instead of 7LM?
Because the top two metal layers are required for global routing in the chip design. If the top metal layers are also used at block level, it will create routing blockages.
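The area-based estimate can be sketched roughly as follows. This is only an illustration under stated assumptions: core area is taken as total cell area divided by a target utilization, the aspect ratio is width/height, and the I/O pad ring and power rings that the full die also needs are not included. The function name and the numbers are hypothetical.

```python
import math


def estimate_core_size(std_cell_area_mm2, macro_area_mm2,
                       utilization, aspect_ratio=1.0):
    """Return (width_mm, height_mm) of the core.

    Assumptions: core area = (std cell area + macro area) / utilization,
    and aspect_ratio = width / height. Pad ring area is NOT included.
    """
    core_area = (std_cell_area_mm2 + macro_area_mm2) / utilization
    height = math.sqrt(core_area / aspect_ratio)
    width = aspect_ratio * height
    return width, height


if __name__ == "__main__":
    # Hypothetical numbers: 0.6 mm^2 of cells, 0.2 mm^2 of macros, 80% utilization
    w, h = estimate_core_size(0.6, 0.2, 0.8)
    print(f"core is about {w:.2f} mm x {h:.2f} mm")
```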
In your project what is die size, number of metal layers, technology, foundry, number of clocks?
Die size: state it in mm, e.g. 1 mm x 1 mm; remember 1 mm = 1000 microns, which is a big size!
Metal layers: see your tech file. Generally for 90 nm it is 7 to 9.
Technology: again, look into the tech files.
Foundry: again, look into the tech files; e.g. TSMC, IBM, etc.
Clocks: look into your design and SDC file!
How many macros in your design?
You know it well as you have designed it! An SoC (System On Chip) design may even have 100 macros!
What is each macro size and number of standard cell count?
Depends on your design.
What are the input needs for your design?
For synthesis: RTL, technology library, standard cell library, constraints
For physical design: netlist, technology library, constraints, standard cell library
What does the SDC constraint file contain?
Clock definitions
Timing exceptions: multicycle paths, false paths
Input and output delays
How did you do power planning? How to calculate core ring width, macro ring width and strap or trunk width? How to find the number of power pads and IO power pads? How are the width of metal and the number of straps calculated for power and ground?
Get the total core power consumption and divide it by the supply voltage to get the total core current; get the metal layer current density value from the tech file; divide the total current by the number of sides of the chip; divide the obtained value by the current density to get the core power ring width. Then calculate the number of straps using some more equations. Will be explained in detail later.
How to find total chip power?
Total chip power = standard cell power consumption + macro power consumption + pad power consumption.
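A rough sketch of the ring-width estimate described above. It assumes total core current is core power divided by supply voltage, that the current splits evenly across the sides, and that the tech file gives the current density limit in mA per um of metal width. All names and numbers are hypothetical.

```python
def core_ring_width_um(core_power_w, supply_v, num_sides, jmax_ma_per_um):
    """Estimate the core power ring width in um.

    Assumptions: I_total = P / V, current is shared equally by
    `num_sides` sides, and jmax is the allowed current per um of width.
    """
    total_current_ma = core_power_w / supply_v * 1000.0
    current_per_side_ma = total_current_ma / num_sides
    return current_per_side_ma / jmax_ma_per_um


if __name__ == "__main__":
    # Hypothetical: 0.4 W core at 1.0 V, 4 sides, 2 mA/um limit
    print(core_ring_width_um(0.4, 1.0, 4, 2.0), "um")
```

In practice the result is usually padded with margin and split between the VDD and VSS rings; that step is not shown here.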
What are the problems faced related to timing?
Prelayout: setup, max transition, max capacitance
Post layout: hold
How did you resolve the setup and hold problem?
Setup: upsize the cells
Hold: insert buffers
Which layer do you prefer for clock routing and why?
The next lower layers below the top two metal layers (the global routing layers), because they have less resistance and hence less RC delay.
If your design has a reset pin, will it affect the input pin or output pin or both?
Output pin.
During power analysis, if you are facing an IR drop problem, then how did you avoid it?
Increase the power metal layer width.
Go for a higher metal layer.
Spread macros or standard cells.
Provide more straps.
Define the antenna problem. How did you resolve it?
A long net can accumulate charge during manufacturing of the device due to the ionisation process. If this net is connected to the gate of a MOSFET, the charge can damage the gate dielectric, and the gate may conduct, causing damage to the MOSFET. This is the antenna problem.
Decrease the length of the net by providing more vias and layer jumping.
Insert an antenna diode.
How delays vary with different PVT conditions? Show the graph.
P increase -> delay increase
P decrease -> delay decrease
V increase -> delay decrease
V decrease -> delay increase
T increase -> delay increase
T decrease -> delay decrease
Explain the flow of physical design and inputs and outputs for each step in
flow.
What is cell delay and net delay?
Gate delay
Transistors within a gate take a finite time to switch. This means that a change on the input of a gate takes a finite time to cause a change on the output. [Magma]
Gate delay = function of (input transition time, Cnet + Cpin).
Cell delay is the same as gate delay.
Cell delay
For any gate it is measured from 50% of the input transition to the corresponding 50% of the output transition.
Intrinsic delay
Intrinsic delay is the delay internal to the gate, from the input pin of the cell to the output pin of the cell.
It is defined as the delay between an input and output pair of a cell when a near-zero slew is applied to the input pin and the output does not see any load. It is predominantly caused by the internal capacitance associated with its transistors.
This delay is largely independent of the size of the transistors forming the gate, because increasing the size of the transistors also increases the internal capacitance.
Net Delay (or wire delay)
The difference between the time a signal is first applied to the net and the time it reaches the other devices connected to that net.
It is due to the finite resistance and capacitance of the net. It is also known as wire delay.
Wire delay = fn(Rnet, Cnet + Cpin)
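One common first-order form of fn(Rnet, Cnet + Cpin) is the Elmore estimate for a distributed RC wire driving a lumped pin load. The sketch below assumes per-unit-length R and C values of the kind a tech file provides; the numbers are hypothetical and driver resistance is ignored.

```python
def wire_delay_ps(length_um, r_ohm_per_um, c_ff_per_um, c_pin_ff):
    """Elmore estimate of wire delay in ps.

    Assumptions: uniform distributed RC wire, a single lumped pin
    load at the far end, and no driver resistance.
    """
    r_net = r_ohm_per_um * length_um   # ohms
    c_net = c_ff_per_um * length_um    # fF
    # Distributed wire contributes R*C/2; the pin load sees the full R.
    delay_fs = r_net * (0.5 * c_net + c_pin_ff)  # ohm * fF = fs
    return delay_fs / 1000.0


if __name__ == "__main__":
    # Hypothetical 1000 um net: 0.1 ohm/um, 0.2 fF/um, 5 fF pin load
    print(wire_delay_ps(1000, 0.1, 0.2, 5.0), "ps")
```

The unit bookkeeping also shows how ohm and farad relate to seconds: 1 ohm x 1 F = 1 s, so ohm x fF gives femtoseconds.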
What are delay models and what is the difference between them?
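Linear Delay Model (LDM): delay is computed as a linear function of the output load and input transition.
Non Linear Delay Model (NLDM): delay is looked up from library tables indexed by input transition and output load.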
What is wire load model?
A wire load model is a statistical estimate of the R and C of a net based on its fanout, used before placement when the actual wire lengths are not yet known.
Why higher metal layers are preferred for Vdd and Vss?
Because they have less resistance and hence lead to less IR drop.
What is logic optimization and give some methods of logic optimization.
Upsizing
Downsizing
Buffer insertion
Buffer relocation
Dummy buffer placement
What is the significance of negative slack?
Negative slack ==> there is a setup violation ==> the design can fail
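To make the slack idea concrete, here is a minimal sketch of a setup slack calculation for a single reg-to-reg path: slack is the required time minus the arrival time. The parameter names and the example values are assumptions chosen for illustration.

```python
def setup_slack_ps(period_ps, launch_latency_ps, capture_latency_ps,
                   clk_to_q_ps, combo_ps, setup_ps, uncertainty_ps=0):
    """Setup slack = data required time - data arrival time (in ps)."""
    arrival = launch_latency_ps + clk_to_q_ps + combo_ps
    required = period_ps + capture_latency_ps - setup_ps - uncertainty_ps
    return required - arrival


if __name__ == "__main__":
    # Hypothetical path at a 1000 ps clock period
    print(setup_slack_ps(1000, 100, 100, 50, 950, 50))  # negative: violation
```

A negative result means the data arrives after it is required, which is the setup violation referred to above.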
What is signal integrity? How it affects Timing?
IR drop, Electro Migration (EM), crosstalk, and ground bounce are signal integrity issues.
If IR drop is more ==> delay increases.
Crosstalk ==> there can be setup as well as hold violations.
What is IR drop? How to avoid? How it affects timing?
There is a resistance associated with each metal layer. Current flowing through this resistance causes a voltage drop, i.e. IR drop.
If IR drop is more ==> delay increases.
What is EM and its effects?
Due to high current flow in the metal, atoms of the metal can be displaced from their original place. When this happens in large amounts, the metal can open, or bulging of the metal layer can happen. This effect is known as Electro Migration.
Effects: either a short or an open of the signal line or power line.
What are types of routing?
Global Routing
Track Assignment
Detail Routing
What is latency? Give the types?
Source latency
It is defined as "the delay from the clock origin point to the clock definition point in the design", i.e. the delay from the clock source to the beginning of the clock tree.
The time a clock signal takes to propagate from its ideal waveform origin point to the clock definition point in the design.
Network latency
It is also known as insertion delay. It is defined as "the delay from the clock definition point to the clock pin of the register".
The time the clock signal (rise or fall) takes to propagate from the clock definition point to a register clock pin.
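The two latency terms can be tied to insertion delay and skew with a small sketch. It assumes insertion delay is reported as the largest network latency among the sinks and skew as the largest minus the smallest; the sink names and values are hypothetical.

```python
def clock_tree_metrics(network_latency_ps_by_sink, source_latency_ps=0):
    """Return (total latency per sink, insertion delay, skew), all in ps.

    Assumptions: insertion delay = max network latency over all sinks,
    skew = max - min network latency (global skew).
    """
    total = {sink: source_latency_ps + lat
             for sink, lat in network_latency_ps_by_sink.items()}
    insertion_delay = max(network_latency_ps_by_sink.values())
    skew = insertion_delay - min(network_latency_ps_by_sink.values())
    return total, insertion_delay, skew


if __name__ == "__main__":
    sinks = {"ff_a/CK": 480, "ff_b/CK": 500, "ff_c/CK": 400}
    total, insertion, skew = clock_tree_metrics(sinks, source_latency_ps=200)
    print(total, insertion, skew)
```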
What is track assignment?
Second stage of the routing, wherein particular metal tracks (or layers) are assigned to the signal nets.
What is congestion?
If the number of routing tracks available for routing is less than the number of tracks required, it is known as congestion.
Whether congestion is related to placement or routing?
Routing
What are clock trees?
Distribution of the clock from the clock source to the sync pins of the registers.
What are clock tree types?
H tree, balanced tree, X tree, clustering tree, fish bone
What is cloning and buffering?
Cloning is a method of optimization that decreases the load of a heavily loaded cell by replicating the cell.
Buffering is a method of optimization that inserts buffers in high fanout nets to decrease the delay.
What is the difference between hard macro, firm macro and soft macro?
What are IPs?
Soft macros
Soft macros are in synthesizable RTL. Soft macros are more flexible than firm or hard macros.
Soft macros are not specific to any manufacturing process. Soft macros have the disadvantage of being somewhat unpredictable in terms of performance, timing, area, or power. Soft macros carry greater IP protection risks because RTL source code is more portable and therefore less easily protected than either a netlist or physical layout data.
From the physical design perspective, a soft macro is any cell that has been placed and routed in a placement and routing tool such as Astro. (This is the definition given in the Astro Rail user manual!) Soft macros are editable and can contain standard cells, hard macros, or other soft macros.
Firm macros
Firm macros are in netlist format. Firm macros are optimized for performance/area/power using a specific fabrication technology.
Firm macros are more flexible and portable than hard macros. Firm macros are more predictive of performance and area than soft macros.
Hard macro
Hard macros are generally in the form of hardware IPs (or we term them hardware IPs!). Hard macros are targeted at a specific IC manufacturing technology. Hard macros are block level designs which are silicon tested and proven. Hard macros have been optimized for power or area or timing.
In physical design you can only access the pins of hard macros, unlike soft macros, which allow us to manipulate them in different ways. You have the freedom to move, rotate, and flip, but you can't touch anything inside hard macros.
A very common example of a hard macro is memory. It can be any design which carries a dedicated single functionality (in general); for example, it can be an MP4 decoder. Be aware of the features and characteristics of a hard macro before you use it in your design. Other than power, timing and area, you should also know pin properties like sync pin, I/O standards, etc. The LEF and GDS2 file formats allow easy usage of macros in different tools.
From the physical design (backend) perspective:
A hard macro is a block that is generated in a methodology other than place and route (i.e. using full custom design methodology) and is brought into the physical design database (e.g. Milkyway in Synopsys; Volcano in Magma) as a GDS2 file.
Synthesis and placement of macros in modern SoC designs are challenging. EDA tools employ different algorithms to accomplish this task along with the targets of power and area. There are several research papers available on these subjects.
What is the difference between a normal buffer and a clock buffer?
The clock net is one of the High Fanout Nets (HFNs). Clock buffers are designed with some special properties like high drive strength and less delay. Clock buffers have equal rise and fall times. This prevents the duty cycle of the clock signal from changing when it passes through a chain of clock buffers.
Normal buffers are designed with a W/L ratio such that the sum of rise time and fall time is minimum. They too are designed for higher drive strength.
What is the difference between HFN synthesis and CTS?
For the clock, no synthesis is carried out in the front end, because there is no placement information for the flip-flops, so synthesis cannot meet true skew targets. In the backend, clock tree synthesis tries to meet "skew" targets. It inserts clock buffers (which have equal rise and fall times, unlike normal buffers!). There is no skew information for any other HFNs.
Logic equivalence checking (LEC) is one type of formal verification; the tools for it are Formality from Synopsys and Conformal LEC from Cadence. LEC is for RTL vs. netlist comparison; the other type of formal verification is property checking.
Formal verification can be classified into 2 types:
1. Logic equivalence checking.
2. Property checking.
LEC steps
In LEC setup, there are several steps that designers need to follow:
Read and elaborate the reference design
Read and elaborate the revised design
Specify “notranslate modules” for blackboxing if the module is a macro, or if the module has already been LECed at block level.
Set certain constraints; for example, when comparing RTL and a synthesis netlist, set case analysis to ignore scan ports.
LEC key point mapping is the next step.
In this step, Conformal LEC will first identify primary inputs and outputs, DFFs, latches, blackboxes, etc. as the key points in both the reference design and the revised design; then pair corresponding reference and revised key points, and perform the LEC comparison after that.
Mapping methods:
There are 2 LEC mapping methods, i.e. name-based and function-based. By default, Conformal LEC will first do name-based mapping, followed by function-based mapping. This approach typically works really well.
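The idea of name-based mapping with a renaming rule can be illustrated with a toy sketch. This is not how Conformal LEC is implemented; it only shows the concept of pairing key points by name after normalising names with a regular-expression rule. The function name, rule format and key point names are all hypothetical.

```python
import re


def name_based_map(golden_points, revised_points, renaming_rule=None):
    """Pair key points whose names match after applying an optional
    (pattern, replacement) renaming rule to both sides.

    Returns (mapped pairs, unmapped golden, unmapped revised).
    """
    def normalize(name):
        if renaming_rule is None:
            return name
        pattern, replacement = renaming_rule
        return re.sub(pattern, replacement, name)

    revised_by_name = {normalize(n): n for n in revised_points}
    mapped, unmapped_golden = {}, []
    for g in golden_points:
        key = normalize(g)
        if key in revised_by_name:
            mapped[g] = revised_by_name.pop(key)
        else:
            unmapped_golden.append(g)
    return mapped, unmapped_golden, sorted(revised_by_name.values())


if __name__ == "__main__":
    golden = ["u1/q_reg[0]", "u1/q_reg[1]", "u1/state_reg"]
    revised = ["u1/q_reg_0_", "u1/q_reg_1_", "u1/cg_latch"]
    # Hypothetical rule: synthesis renamed bus bits from [n] to _n_
    rule = (r"\[(\d+)\]", r"_\1_")
    print(name_based_map(golden, revised, rule))
```

Points left unmapped here are the ones a function-based pass, or a manual renaming rule, would then have to resolve.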
In LEC report generation, designers will typically dump the following reports:
report unmapped points (-unreachable, -notmapped, -extra)
report compare data (-class nonequivalent, -class abort)
report black box (check if there are any unexpected black boxes)
report ignored inputs and outputs
report pin constraints
report output stuck at
report floating signals
report renaming rule (this guides how LEC performs key point mapping)
LEC non-equivalence debug will be the last step.
Debugging typically starts from unmapped points, and possible root causes include:
Unmapped BBOX pins causing NEQs (use a renaming rule if pin names do not match)
Unmapped DFF/DLATCH/CUT/PI causing NEQs (optimize and merge DFF/DLATCH in LEC; resolve unbalanced loop cutting; constrain test/scan signals)
Incorrect mapping causing NEQs (remap the incorrectly mapped pairs manually)
After resolving all mapping issues, designers can start debugging real mismatches, and possible root causes include:
Unbalanced floating signals causing NEQs (tie off floating pins in RTL, or use “add tie signal” to individually tie each floating Z)
Logic optimized differently between LEC and implementation, causing NEQs
Debug phase mapping (recommended: turn off phase inversion in synthesis, or auto analysis & remap with phase)
Balanced but opposite optimization constant
Constraints depend on the stage of LEC.
1. RTL to Pre-scan: no constraints. Just read the design, map and compare.
2. Pre-Scan vs. Post-Scan: test constraints to mask the test logic. Map and compare.
3. Post-scan vs. Post-routed/CTS netlist: no constraints.
Is the do file that is generated after synthesis sufficient for doing LEC?
A do file is nothing but your own way of generating the commands.
1. Set the log file
2. Read the .lib
3. Read the golden and revised files.
4. Set the mapping rules.
5. Compare.
6. Write the reports.
If you have an automated flow for generating the do file for LEC, that's good to start with. If you are using Formality for formal verification, the SVF file is useful, whereas LEC doesn't support SVF constructs.
do file commands
Invoke Conformal LEC inside the equivalence checking directory in non-GUI mode by using the command “lec -xl -nogui -color -64 -dofile counter.do”
-xl :- Launches Encounter® Conformal® L with Datapath and advanced equivalence checking capabilities
-nogui :- Starts the session in non-GUI mode
-color :- Turns on color-coded messaging when in non-GUI mode
-64 :- Runs the Encounter® Conformal® software in 64-bit mode
-dofile <filename> :- Runs the script <filename> after starting LEC
set log file <filename.log> -replace
Saves the log file, replacing any existing log file with the same name.
Read the Verilog library by entering:
read library <filename> -verilog -both
[Both Verilog and Liberty formats can be used, but Verilog format is preferred. Steps to generate .v from .lib using Conformal are mentioned at the end of this section.]
-verilog :- to indicate that the library is in Verilog format
-both :- use the same library to model or structure both the golden and revised designs
Read the Golden Design (RTL):
read design <filename> -verilog -golden
-verilog :- to indicate that the RTL is coded in Verilog
-golden :- to input the golden design
Read the Revised Design:
read design <filename> -verilog -revised
-verilog :- to indicate that the netlist is in Verilog
-revised :- to input the revised design
Ignore the scan input (scan_in) and scan output (scan_out) pins (as these are not available in the golden design, and a primary output key point is a compare point):
add ignored inputs scan_in -revised [ignores scan_in pin]
add ignored outputs scan_out -revised [ignores scan_out pin]
Constrain the scan enable (SE) pin to zero to keep the revised design in functional mode:
add pin constraints 0 SE -revised [the tool keeps the design in functional mode and ignores the scan_in pin during compare. Also, scan_in is not a compare point]
Change the mode of operation from ‘setup’ to ‘lec’:
set system mode lec
Conformal LEC has two modes of operation, i.e. SETUP mode and LEC mode. Setup mode is used to prepare the design to be compared. Any command that affects the way the design is modeled needs to be issued in this mode. LEC mode is where the designs get modeled, key points are mapped and the compare process takes place.
Compare golden vs. revised netlist:
add compare points -all
compare
Once the compare process is completed, Conformal LEC will print a summary report that tells how many key points are equivalent, non-equivalent, aborted and not compared.
Generate the verification report:
report verification
Reports a table of all violated checklist items.
Use the command ‘set gui on’ to turn on the GUI window. In case of a mapping issue, a comparison issue or non-equivalence, use the mapping manager, debug manager or schematic viewer options in LEC to resolve the issue. The same flow is used to compare netlists generated at different stages of physical design. Use proper modelling directives and constraints.
Create .v from .lib
Invoke LEC by using the command “lec -xl -nogui -64”
Read the library in Liberty (.lib) format by using the command “read library <.lib file> -liberty -both”
Write out the Verilog file by using the command “write library <file name*> -verilog”
*file name can be any name with extension .v
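For an automated flow, the do file walked through above can be generated by a small script. The sketch below only assembles the commands quoted in this section, in the same order (setup-mode commands first, then the switch to LEC mode); the function name, default pin names and file names are placeholders, and real flows usually need extra design-specific constraints.

```python
def build_lec_dofile(log_file, library, golden, revised,
                     scan_in="scan_in", scan_out="scan_out",
                     scan_enable="SE"):
    """Return the text of a Conformal LEC do file built from the
    commands quoted in this section (file and pin names are placeholders)."""
    lines = [
        f"set log file {log_file} -replace",
        f"read library {library} -verilog -both",
        f"read design {golden} -verilog -golden",
        f"read design {revised} -verilog -revised",
        # Setup-mode constraints must come before switching to LEC mode
        f"add ignored inputs {scan_in} -revised",
        f"add ignored outputs {scan_out} -revised",
        f"add pin constraints 0 {scan_enable} -revised",
        "set system mode lec",
        "add compare points -all",
        "compare",
        "report verification",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_lec_dofile("counter.log", "lib.v", "counter_rtl.v",
                           "counter_netlist.v"), end="")
```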
Logical Equivalence Checking Netlist vs Netlist problem
Comparing a plain synthesis netlist vs. a low power synthesis netlist, where clock gating is the only low power technique used: there are unmapped points in the low power netlist (red coloured "U"), which belong to the clock gating, since clock gating is not present in the plain synthesis netlist. I have enabled flatten model -clock gating.
You should not ignore these. The "Plain Synthesis Netlist" is "Golden" and your "Low Power Synthesis Netlist" is "Revised". Also, as you have written, you can see the unmapped points at the cgc cells. Add the "Verilog definition" of the cgc library cells on the golden side, where you read and parse all the files, so as to tell the tool to look for these cells and map them.
If you check module by module, the whole process becomes simpler. In LEC our main motive is to check whether the flip flops and latches we described in RTL are all present, or whether some extra flops and latches have been added which are not required.
In other terms, LEC is used to check whether the synthesized design (at gate level), which will go on to the back end, matches the specification as written in the form of RTL. If we find any mismatch, then the synthesized netlist does not implement the logic described in the RTL.
You can either do a "flat" LEC run, which first fully synthesizes the design, maps it and then compares, or a hierarchical run, which compares modules separately and can then reuse the results for other instances, saving time. If your design is very large, you can think about dividing it into sub-designs or doing a hierarchical run.
(Equivalence Checking RTL vs RTL) One approach would be the use of equivalence checkers, which were traditionally oriented towards combinational equivalence checking but have some sequential capabilities. The advantage is that these tools are well known to designers and are fairly mature. However, the tools are still primarily oriented towards verifying the output of a synthesis tool. These tools are able to deal with some changes in state encoding, allowing a synthesis tool to perform some retiming operations. However, the debug environment is not ideally suited to RTL designers: the user is presented with a schematic displaying gate-level differences between the two designs, without the benefit of a waveform or stimulus.
Engineering Change Order (ECO) is the process of making local changes to the design netlist without re-running the entire synthesis and P&R from scratch.
• ECO types:
    • Functional ECO: change the functionality of the design
    • Non-functional ECO: fix timing, crosstalk
• Stage:
    • Pre-mask: standard cells are used to implement the modifications
    • Post-mask: base layers are taped out; metal fix using spare cells
As with other types of formal analysis, equivalence checking of a large design is a tough mathematical problem. For most ASICs and simple FPGAs, the problem can be simplified by mapping the state elements (memory and registers) between the two designs. If this is possible, combinational equivalence checking needs to consider only the combinational logic between the state elements.
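To make the idea concrete, here is a minimal Python sketch, assuming the state elements of two designs are already mapped by name so that only the logic feeding each mapped register has to be compared. The register names (`q0`, `q1`), the functions `golden_next`, `revised_next` and `buggy_next`, and the tiny counter itself are all made up for illustration. Plain enumeration stands in for the BDD/SAT engines a real checker would use.

```python
from itertools import product

def inv(x):
    """1-bit NOT on 0/1 integers."""
    return x ^ 1

# Hypothetical next-state (D-pin) logic for two registers, q0 and q1, that are
# assumed to be already mapped by name between the two designs.
def golden_next(q0, q1, en):
    # RTL view: 2-bit counter with enable
    return {"q0": q0 ^ en, "q1": q1 ^ (q0 & en)}

def revised_next(q0, q1, en):
    # Netlist view: the same function rebuilt from AND/OR/NOT gates
    carry = q0 & en
    return {
        "q0": (q0 & inv(en)) | (inv(q0) & en),
        "q1": (q1 & inv(carry)) | (inv(q1) & carry),
    }

def buggy_next(q0, q1, en):
    # Deliberately wrong carry (OR instead of AND) to show a mismatch
    return {"q0": q0 ^ en, "q1": q1 ^ (q0 | en)}

def compare_cones(golden, revised, n_inputs):
    """Compare every mapped key point over all input/state combinations.

    Returns {key_point: [failing input vectors]}; empty means equivalent.
    """
    mismatches = {}
    for vec in product((0, 1), repeat=n_inputs):
        g, r = golden(*vec), revised(*vec)
        for point in g:
            if g[point] != r[point]:
                mismatches.setdefault(point, []).append(vec)
    return mismatches

print(compare_cones(golden_next, revised_next, 3))        # no mismatches
print(sorted(compare_cones(golden_next, buggy_next, 3)))  # only q1 differs
```

Because the registers are treated as plain inputs and outputs of the cones, the check never has to reason about sequences of clock cycles, which is what keeps the combinational problem tractable.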
Sequential equivalence checking, in its purest form, treats two designs as black boxes in which the input and outputs must match but the internal state can be entirely different.
In practice, two designs may be similar but not close enough for combinational equivalence checking. This is commonly the case for FPGA synthesis, which may reorder state machines or migrate logic across registers to meet timing requirements. Thus, proving formal equivalence for FPGAs typically employs sequential checking for optimized parts of the design and combinational checking for the remainder.
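A rough sketch of the black-box view, in Python: two small state machines with different state encodings are compared by exploring every reachable pair of states and checking that the outputs always agree. The machines (`golden_step`, `revised_step`, `buggy_step`) are invented for this example, and explicit breadth-first search is only a stand-in for the symbolic algorithms that production sequential checkers rely on.

```python
from collections import deque

def golden_step(state, x):
    # State is the parity (0 or 1) of the number of 1s seen so far
    nxt = state ^ x
    return nxt, nxt                  # (next_state, output)

def revised_step(state, x):
    # State counts 1s modulo 4; the output is still just the parity
    nxt = (state + x) % 4
    return nxt, nxt & 1

def buggy_step(state, x):
    # Counts modulo 3, so the parity output eventually goes wrong
    nxt = (state + x) % 3
    return nxt, nxt & 1

def sequentially_equivalent(step_a, init_a, step_b, init_b, inputs=(0, 1)):
    """Search all reachable state pairs of the two machines run in lockstep.

    Returns (True, None) if outputs always match, else (False, input_trace).
    """
    start = (init_a, init_b)
    seen = {start}
    queue = deque([(start, [])])
    while queue:
        (sa, sb), trace = queue.popleft()
        for x in inputs:
            na, oa = step_a(sa, x)
            nb, ob = step_b(sb, x)
            if oa != ob:
                return False, trace + [x]
            if (na, nb) not in seen:
                seen.add((na, nb))
                queue.append(((na, nb), trace + [x]))
    return True, None

print(sequentially_equivalent(golden_step, 0, revised_step, 0))
print(sequentially_equivalent(golden_step, 0, buggy_step, 0))
```

Note that the golden machine has two states and the revised one has four, so there is no 1-to-1 register mapping to lean on; the comparison only ever looks at inputs and outputs.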
How is equivalence checking related to other formal approaches?
It is common to divide formal verification into equivalence checking and property checking. The latter category includes both assertion-based verification and automated formal solutions such as apps. A simple way to conceptually unify these two categories is to think of equivalence checking as ABV on a model containing both designs to be compared as well as a set of assertions specifying that the outputs of the two designs must match. This does not necessarily represent how actual tools are built, but it may help in understanding.
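This conceptual view can be sketched in a few lines of Python: one combined model instantiates both designs, and equivalence is just one property checked on that model. The designs, the signal names and the helper `check_property` are hypothetical, and exhaustive enumeration is used purely for illustration.

```python
from itertools import product

def golden(a, b, c):
    return (a & b) | (a & c)

def revised(a, b, c):
    return a & (b | c)

def combined_model(a, b, c):
    """One model containing both designs; returns the signals of interest."""
    return {"out_golden": golden(a, b, c), "out_revised": revised(a, b, c)}

def outputs_match(signals):
    """The 'assertion': the outputs of the two designs must agree."""
    return signals["out_golden"] == signals["out_revised"]

def check_property(model, prop, n_inputs):
    """Return the first input vector that violates prop, or None if it holds."""
    for vec in product((0, 1), repeat=n_inputs):
        if not prop(model(*vec)):
            return vec
    return None

# Equivalence is simply one property among others on the combined model.
print(check_property(combined_model, outputs_match, 3))
```

The same `check_property` helper could be handed any other predicate over the model's signals, which is the sense in which equivalence checking and property checking are two uses of one mechanism.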
Nothing can replace formal analysis. Simulation acceleration uses hardware to run faster, so it will execute more tests in a given time, but it still exercises very little design behavior. In-circuit emulation (ICE) and prototypes built using FPGAs are typically designed to be plugged into the end system in lieu of the actual chip under design. They may run even faster than acceleration and will exercise different aspects of the design. While ICE and prototypes are valuable for hardware-software co-verification, they are in no way a substitute for the exhaustive nature of formal verification.
It is no longer sufficient for FPGA designers to defer verification to the bring-up lab. Modern FPGAs are complex system-on-chip (SoC) devices with all the complexity of their ASIC counterparts. The difficulty of lab debug and long reprogramming cycles mandate that FPGA teams adopt ASIC-like verification processes. Formal verification tools work equally well on both technologies and have been adopted on many FPGA projects.
Equivalence checking is arguably even more important for FPGAs than for ASICs. The FPGA synthesis and place-and-route processes often make significant changes to the design structure, including moving logic across register stages, in order to maximize use of the underlying chip technology. These manipulations present some risk of altering design functionality. Formal equivalence checking can catch any such problems, but sequential equivalence checking is required since there is no 1-to-1 mapping between state elements in the RTL source and the optimized netlist.
Sequential logic equivalence checking (SLEC) is effective in finding bugs in new logic required to reduce dynamic power consumption, validating last-minute ECOs, or verifying that design optimizations aren’t too aggressive. It is also very efficient in verifying safety mechanisms used in ISO 26262 and other fault-mitigating designs.
SLEC’s effectiveness comes from using exhaustive formal verification algorithms, which do not require a testbench and indeed are completely automated, so the user does not need to know about formal technology themselves.
Given the formal-based nature of the analysis, SLEC can prove functional equivalence of the two designs for all inputs and all time, or identify any differences between the two designs. In contrast, simulation-based approaches cannot prove sequential equivalence. Indeed, even with well-written constrained-random testbenches, simulation may find functional differences depending on the quality of the testbenches, but such analysis could still miss critical corner cases. As such, SLEC can save a lot of resimulation time after small modifications of the design.
Non-equivalence means that the points have different functionality. Unmapped means that Conformal can't find the corresponding object in the Revised design according to the renaming rules. If your points are non-equivalent, check the correctness of the Revised design. If they are reported as unmapped (REPORT UNMAPPED POINTS), do something like this:
Do the following to manually map key points:
1. Click the primary output pin in the Golden design to highlight it.
2. Right-click and choose Set Target Mapping Point from the pop-up menu.
3. Click the primary output pin in the Revised design to highlight it.
4. Right-click and choose Add Mapping Point – Non-invert from the pop-up menu.
Modify the netlist to use either an input declaration or supply* on both sides. Then, once it has been signed off by another person, it should be safe enough.
Unmapped in LEC
Unreachable means there's no path from that key point to a primary output through any sort of logic. Hence such points can't have an effect on the behaviour of the design. Conformal by default won't map them or compare them. The typical causes:
1) unused code in RTL
2) spare gates
3) disabled logic (e.g. pre to post test)
We still like to show you that you have them, though, because they could be a problem if they are unexpected.
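The "no path to a primary output" definition is essentially a graph reachability question. The Python sketch below walks backwards from the primary outputs over a toy netlist and reports the key points that were never reached. The netlist, the node names and the function `unreachable_keypoints` are invented for illustration and say nothing about how Conformal implements this internally.

```python
def unreachable_keypoints(fanout, keypoints, primary_outputs):
    """Return the key points that have no path to any primary output.

    fanout maps each node name to the list of nodes it drives.
    """
    # Build reverse edges so we can walk backwards from the outputs.
    fanin = {}
    for src, dsts in fanout.items():
        for dst in dsts:
            fanin.setdefault(dst, set()).add(src)

    reached = set(primary_outputs)
    stack = list(primary_outputs)
    while stack:
        node = stack.pop()
        for src in fanin.get(node, ()):
            if src not in reached:
                reached.add(src)
                stack.append(src)
    return sorted(k for k in keypoints if k not in reached)

# Toy netlist: ff_debug only drives a spare gate whose output goes nowhere.
fanout = {
    "in_a": ["ff_data"],
    "ff_data": ["out_y"],
    "in_b": ["ff_debug"],
    "ff_debug": ["spare_and"],
    "spare_and": [],
}
print(unreachable_keypoints(fanout, ["ff_data", "ff_debug"], ["out_y"]))
```

Here `ff_debug` is flagged because nothing it drives ever reaches `out_y`, which corresponds to the spare-gate cause in the list above.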
LEC Mapping and reports
There are two LEC mapping methods: name-based and function-based.
By default, Conformal LEC will first do name-based mapping, followed by function-based mapping. This approach typically works really well.
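As a rough illustration of the two passes, the Python sketch below first pairs key points whose names agree after applying renaming rules, and then pairs the leftovers by comparing truth tables. Everything here is an assumption made for the example: the key point names, the single regex rule, and the use of a full truth table as the "function" signature. It is not a description of Conformal's actual mapping algorithm.

```python
import re
from itertools import product

def truth_table(func, n_inputs):
    """Signature of a key point: its output for every input combination."""
    return tuple(func(*v) for v in product((0, 1), repeat=n_inputs))

def map_keypoints(golden, revised, n_inputs, rename_rules=()):
    """golden/revised map key point name -> function of the primary inputs.

    Returns (mapping golden_name -> revised_name, sorted unmapped golden names).
    """
    def normalize(name):
        for pattern, repl in rename_rules:
            name = re.sub(pattern, repl, name)
        return name

    # Pass 1: name-based mapping, after renaming rules are applied to both sides
    mapping = {}
    revised_by_name = {normalize(r): r for r in revised}
    for g in golden:
        r = revised_by_name.get(normalize(g))
        if r is not None:
            mapping[g] = r

    # Pass 2: function-based mapping for whatever is still unmapped
    taken = set(mapping.values())
    by_signature = {}
    for r, func in revised.items():
        if r not in taken:
            by_signature.setdefault(truth_table(func, n_inputs), []).append(r)
    for g, func in golden.items():
        if g in mapping:
            continue
        candidates = by_signature.get(truth_table(func, n_inputs), [])
        if len(candidates) == 1:          # accept only unambiguous matches
            mapping[g] = candidates.pop()

    return mapping, sorted(g for g in golden if g not in mapping)

golden = {
    "u_core/state_reg[0]": lambda a, b: a & b,
    "u_core/flag_reg": lambda a, b: a ^ b,
    "u_core/dbg_reg": lambda a, b: a,
}
revised = {
    "u_core_state_reg_0_": lambda a, b: a & b,
    "U123": lambda a, b: (a | b) & ((a & b) ^ 1),
}
rules = [(r"[\[\]/.]", "_")]   # hypothetical rule: flatten hierarchy and bus characters
print(map_keypoints(golden, revised, 2, rules))
```

In this toy case the state register maps by name once the rule is applied, the flag register maps by function to the anonymously named `U123`, and the debug register is left unmapped because nothing on the revised side corresponds to it.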
In LEC report generation, designers will typically dump the following reports:
report unmapped points (-unreachable, -notmapped, -extra)
report compare data (-class nonequivalent, -class abort)
report black box (check if there are any unexpected black boxes)
report ignored inputs and outputs
report pin constraints
report output stuck at
report floating signals
report renaming rule (this guides how LEC performs key point mapping)
LEC non-equivalence debug will be the last step.
Debugging typically starts from unmapped points, and possible root causes include:
Not-mapped BBOX pins cause NEQs (use a renaming rule if the pin names do not match)
Not-mapped DFF/DLATCH/CUT/PI points cause NEQs (optimize and merge DFF/DLATCH in LEC; resolve unbalanced loop cutting; constrain test/scan signals)
Incorrect mapping causes NEQs (remap the incorrectly mapped pairs manually)
After resolving all mapping issues, designers can start debugging real mismatches, and possible root causes include:
Unbalanced floating signals can cause NEQs (tie off floating pins in RTL, or use “add tie signal” to individually tie each floating Z)
Logic optimized differently between LEC and implementation, causing NEQs
Debug phase mapping (recommended: turn off phase inversion in synthesis; or auto analysis & remap with phase)
Balanced but opposite optimization constant
Abort points
Abort points come from large fan-in logic cones, and also from not following a few RTL rules, such as those on using X in RTL. In some cases you first need to figure out why modules have been blackboxed. These are also important.
Blackboxes: causes and techniques
Blackboxes are produced in the following cases:
1) Either you have not provided a basic gate or common file, or you have provided an extra file in the LEC scripts that was not included in synthesis (the src_design.lst file).
2) Or there is a problem in your functionality. Check the parameters and null slices in your design. If you find null slices and remove them by changing the parameters, the blackboxes will be removed automatically.
After the blackboxes are removed, the problems they caused are usually resolved as well.
Though there are different techniques available for handling aborts, such as the hierarchical flow and the OVF flow, these have their own limitations. The technique discussed here helps identify common points that have common functionality in the RTL and the gate-level netlist.
Adding CUT points at these common points helps to break the huge and complex combinational logic into smaller parts, and this eventually helps in reducing the aborts.
The technique is user-friendly and easy to debug, has no logic equivalence coverage loss and results in reduced runtime. It also has negligible impact on the QoR of the gate-level netlist in terms of area and timing optimization.
A unique technique to identify CUT points in LEC to resolve aborts
With increasing design complexity, including complex data structures, there is an increase in the number of aborts in the Logic Equivalence Checker (LEC), mainly due to the limitation of the LEC tool in handling complex logic cones.
Aborts are an inconclusive result that the formal verification tool is not able to resolve. Aborts can be due to:
i) Complex datapath;
ii) Huge logic cone for comparison; and
iii) Large number of don’t cares.
The greater the number of aborts in a design, the lower the LEC coverage and the greater the probability of missing some non-equivalence in the design. Though there are different techniques for resolving aborts, they either involve complex methodologies or a lot of manual processes. Any technique applied should have a short turnaround time, cause minimal LEC coverage loss, and be designer friendly.
Adding cut points is a preferred way of avoiding aborts and has the advantage of no LEC coverage loss. Though the LEC tool itself can add cut points at certain points in the combinational logic, the positions it picks are usually not design friendly. The position of a cut point matters for a correct comparison and for avoiding false non-equivalences.
Adding a cut point in the hierarchy helps the tool see the same point in both the RTL and the gate-level netlist. LEC treats the input and output of the comparison elements separately; hence the input of a cut point is verified during the LEC comparison, so adding cut points should not cause any problems. Also, adding a cut point partitions the datapath and lets the LEC tool see a reduced logic cone, and hence resolve the aborts.
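A small Python experiment shows why partitioning a cone helps. A 16-input cone is compared once as a whole and once split at a common internal point: first the logic driving the cut point, then the downstream logic with the cut point treated as a free input. The functions and the choice of a parity tree as the common point are made up for the example, and brute-force enumeration only stands in for the effort a real tool spends on a cone.

```python
from itertools import product

def equivalent(f, g, n_inputs):
    """Exhaustively compare f and g; return (is_equal, vectors_evaluated)."""
    evaluated = 0
    for vec in product((0, 1), repeat=n_inputs):
        evaluated += 1
        if f(*vec) != g(*vec):
            return False, evaluated
    return True, evaluated

# Common internal point: an 8-input parity, coded differently on each side.
def parity_golden(*bits):
    acc = 0
    for b in bits:
        acc ^= b
    return acc

def parity_revised(*bits):
    return sum(bits) % 2

# Downstream logic: takes the internal point p plus 8 further inputs.
def tail_golden(p, *rest):
    return p & int(any(rest))

def tail_revised(p, *rest):
    return int(p == 1 and sum(rest) > 0)

def full_golden(*bits):
    return tail_golden(parity_golden(*bits[:8]), *bits[8:])

def full_revised(*bits):
    return tail_revised(parity_revised(*bits[:8]), *bits[8:])

flat = equivalent(full_golden, full_revised, 16)      # one 16-input cone
head = equivalent(parity_golden, parity_revised, 8)   # cone driving the cut point
tail = equivalent(tail_golden, tail_revised, 9)       # cut point as a free input
print(flat, head, tail)
```

The flat comparison has to cover 2^16 input vectors, while the two partitioned comparisons cover 2^8 and 2^9. The cost is that the cut point is treated as an unconstrained input downstream, so a badly placed cut can report a false non-equivalence if the two sides only agree for values the cut signal can actually take.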
Unmapped points in LEC
The simplest thing is to provide the SVF file from synthesis. This tells the tool what logic was removed, what was optimized and how it was optimized. This resolves most issues.
How to enlarge the number of unmapped key points when doing LEC
Add -max_unmap after the remodel command.
When I perform LEC using the Cadence tool Conformal, there are too many unmapped key points in my design. This leads to a huge number of non-equivalent points when I compare the revised and golden designs of my RTL and gate netlist.
Try checking the constraints, the libraries and the options; you might also need some renaming rules. Generate reports to get some information from them. You should clear the unmapped points before comparison. Start by category: Not-mapped, Unreachable and Extra.
If you are doing a formal check between RTL and netlist, it is possible that the synthesis tool has optimized some signals in the RTL and removed unused signals. In this case, if LEC is unable to understand the logic, you may get unmapped points.
Another case is when you insert DFT logic into the synthesis netlist (scan insertion etc.); you may get some extra ports or signals used for DFT. Go through the list of unmapped points and discuss them with the concerned RTL designer. Unmapped points are OK unless the synthesis tool has done some unwanted optimisation.
set mapping method -name first -unreach
An unreachable FF/latch in LEC is one that does not affect the functionality of the design. So, if you know the functionality of the design very well, validate whether this FF really has no effect on the functionality. If it is really unreachable, there is no need to worry; you can ignore this warning.
If you have very little information about the design, then try the following steps to map the unreachable points.
You said that the unreachable points are in the golden design.
read verilog golden.v -golden -keep_unreach
read verilog revised.v -revised -keep_unreach
set mapping method -unreach
Scenario of the problem I am facing:
1. Read a Golden design and a Revised design, map the points and compare.
2. Read the same Golden design and a different Revised design. This time the Revised design has been created by changing some constraints in DC.
The problem I am facing is that some points in the Golden design are unreachable in the first run, but they are NOT_MAPPED in the second run. How can this be?
This is because, according to Conformal, NOT_MAPPED unmapped points are reachable but do not have a corresponding point in the logic fan-in cone of the corresponding design.
If you look at the definition of 'Unreachable' points, they are points that do not reach any of the compare points/outputs. Obviously, then, they do not influence any of the functionality. Therefore it is possible that synthesis (during optimization) has removed this point. Now it is unmapped, as there is no corresponding equivalent point in the revised design. Looks logical.
By the way, check the log file. If it says "DESIGNS EQUAL" then you are fine.
Sometimes the not-mapped points are due to sequential merging of logic during synthesis. Look in your synthesis log file for any sequential-merge statements. If there are any, add instance equivalence constraints to the dofile and rerun LEC; this may solve the issue.
procedure to find this in log file
During compare, not-mapped points are a problem. They have to be debugged, and you must ensure that they are mapped and compared. This could be due to many reasons: incorrect mapping, incorrect modelling, incorrect pin constraints, etc.
Unreachables could be clock-gating latches etc. that are not present in the revised design, or points that do not reach a compare point; but these have to be reviewed as well, since this is your sign-off.
If you are performing RTL2Gate, that means you are using hierarchical compare (if not, please try this). Now check the not-mapped point: if it is in the golden design, see in the synthesis log file whether it was optimized away, say as a constant or a sequential merge, in which case ensure that your LEC dofile has the necessary "set flatten model" options to address that modelling.