* What is signal integrity? How does it affect timing?
* What is IR drop? How can it be avoided? How does it affect timing?
* What is EM and what are its effects?
* What is floor plan and power plan?
* What are types of routing?
* What is a grid? Why do we need it, and what are the different types of grids?
* What is the core, and how will you decide the width/height ratio for the core?
* What is effective utilization and chip utilization?
* What is latency? Give the types?
* What is LEF?
* What is DEF?
* What are the steps involved in designing an optimal pad ring?
* What are the steps that you have done in the design flow?
* What are the issues in floor plan?
* How can you estimate area of block?
* How much aspect ratio should be kept (or have you kept) and what is the
utilization?
* How to calculate core ring and stripe widths?
* What if hot spot found in some area of block? How you tackle this?
* After adding stripes also if you have hot spot what to do?
* What is threshold voltage? How does it affect timing?
* What are the contents of lib, lef, sdc?
* What is meant by 9 track, 12 track standard cells?
* What is scan chain? What if scan chain is not detached and reordered? Is it
compulsory?
* What are setup and hold? Why are they there? What if setup and hold are violated?
* In a circuit, for reg to reg path ...Tclktoq is 50 ps, Tcombo 50ps, Tsetup 50ps,
tskew is 100ps. Then what is the maximum operating frequency?
* How do R and C values affect timing?
* How are ohm (R) and farad (C) related to second (T)?
* What is transition? What if transition time is more?
* What is difference between normal buffer and clock buffer?
* What is antenna effect? How it is avoided?
* What is ESD?
* What is cross talk? How can you avoid?
* How double spacing will avoid cross talk?
* What is difference between HFN synthesis and CTS?
* What is hold problem? How can you avoid it?
* For an iteration we have 0.5ns of insertion delay and 0.1 skew and for other
iteration 0.29ns insertion delay and 0.25 skew for the same circuit then which
one you will select? Why?
* What is partial floor plan?
* What parameters (or aspects) differentiate Chip Design and Block level
design?
* How do you place macros in a full chip design?
* Differentiate between a Hierarchical Design and flat design?
* Which is more complicated when you have a 48 MHz and 500 MHz clock design?
* Name a few tools which you used for physical verification?
* What input files will you give for PrimeTime correlation?
* What are the algorithms used while routing? Will it optimize wire length?
* How will you decide the Pin location in block level design?
* If the routing congestion exists between two macros, then what will you do?
* How will you place the macros?
* How will you decide the die size?
* If a lengthy metal layer is connected to diffusion and poly, then which one
will be affected by the antenna problem?
* If the full chip design is routed by 7 layer metal, why macros are designed
using 5LM instead of using 7LM?
* In your project what is die size, number of metal layers, technology,
foundry, number of clocks?
* How many macros in your design?
* What is each macro size and the standard cell count?
* How did you handle the clock in your design?
* What are the input needs for your design?
* What does the SDC constraint file contain?
* How did you do power planning?
* How to find total chip power?
* How to calculate core ring width, macro ring width and strap or trunk width?
* How to find number of power pad and IO power pads?
* What are the problems faced related to timing?
* How did you resolve the setup and hold problem?
* If 10000 or more problems come up in your design, then what will you
do?
* In which layer do you prefer for clock routing and why?
* If in your design has reset pin, then it’ll affect input pin or output pin or
both?
* During power analysis, if you are facing an IR drop problem, then how did you
avoid it?
* Define the antenna problem and how did you resolve it?
* How delays vary with different PVT conditions? Show the graph.
* Explain the flow of physical design and inputs and outputs for each step in
flow.
* What is cell delay and net delay?
* What are delay models and what is the difference between them?
* What is wire load model?
* What do SDC constraints contain?
* Why higher metal layers are preferred for Vdd and Vss?
* What is logic optimization and give some methods of logic optimization.
* What is the significance of negative slack?
* How the width of metal and number of straps calculated for power and ground?
* What is negative slack ? How it affects timing?
* What is track assignment?
* What is gridded and gridless routing?
* What is a macro and standard cell?
* What is congestion?
* Whether congestion is related to placement or routing?
* What are clock trees?
* What are clock tree types?
* Which layer is used for clock routing and why?
* What is cloning and buffering?
* What are placement blockages?
* How do slow and fast transitions at inputs affect timing for gates?
* What is antenna effect?
* What are DFM issues?
* What is .lib, LEF, DEF, .tf?
* What is the difference between synthesis and simulation?
* What is metal density, metal slotting rule?
* What is OPC, PSM?
* Why clock is not synthesized in DC?
* What are high-Vt and low-Vt cells?
* What do corner cells contain?
* What is the difference between core filler cells and metal fillers?
* How to decide number of pads in chip level design?
* What are tie-high and tie-low cells and where are they used?
What parameters (or aspects) differentiate Chip Design and Block level design?
Chip design has I/O pads; block design has pins.
Chip design uses all available metal layers; block design may not use all
metal layers.
A chip is generally rectangular in shape; blocks can be rectangular or
rectilinear.
Chip design requires packaging; block design ends in a macro.
How do you place macros in a full chip design?
First check flylines, i.e. check net connections from macro to macro and macro
to standard cells.
If there are more connections from macro to macro, place those macros near each
other, preferably near the core boundaries.
If an input pin is connected to a macro, it is better to place the macro near
that pin or pad.
If a macro has more connections to standard cells, spread the macros inside the
core.
Avoid crisscross placement of macros.
Use soft or hard blockages to guide the placement engine.
Differentiate between a Hierarchical Design and flat design?
Hierarchical design has blocks and subblocks in a hierarchy; flattened design
has no subblocks and it has only leaf cells.
Hierarchical design takes more run time; flattened design takes less run time.
Which is more complicated when you have a 48 MHz and 500 MHz clock design?
500 MHz; because it is more constrained (i.e. shorter clock period) than the
48 MHz design.
Name a few tools which you used for physical verification?
Hercules from Synopsys, Calibre from Mentor Graphics.
What input files will you give for PrimeTime correlation?
Netlist, Technology library, Constraints, SPEF or SDF file.
If the routing congestion exists between two macros, then what will you do?
Provide soft or hard blockage.
How will you decide the die size?
By checking the total area of the design you can decide die size.
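As a rough illustration of turning total area into a die size, the sketch below divides the cell and macro area by a target utilization and adds a fixed margin for the pad ring. The function names, the convention that aspect ratio means height / width, and all numbers are assumptions made for this example, not values from any real design.

```python
import math


def estimate_core_size(std_cell_area, macro_area, utilization, aspect_ratio=1.0):
    """Return (width, height) of the core in microns.

    aspect_ratio is taken here as height / width (an assumption of this sketch).
    """
    core_area = (std_cell_area + macro_area) / utilization
    width = math.sqrt(core_area / aspect_ratio)
    height = core_area / width
    return width, height


def estimate_die_size(core_width, core_height, io_margin):
    """Add the same pad/ring margin on every side of the core."""
    return core_width + 2 * io_margin, core_height + 2 * io_margin


# Hypothetical numbers: areas in um^2, margin in um.
core_w, core_h = estimate_core_size(600_000.0, 200_000.0, utilization=0.8)
die_w, die_h = estimate_die_size(core_w, core_h, io_margin=100.0)
print(f"core ~ {core_w:.0f} x {core_h:.0f} um, die ~ {die_w:.0f} x {die_h:.0f} um")
```

A lower utilization or a larger margin grows the die; in a pad-limited design the pad count, not the core area, would set the size, which this sketch does not model.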
If a lengthy metal layer is connected to diffusion and poly, then which one
will be affected by the antenna problem?
Poly
If the full chip design is routed by 7 layer metal, why macros are designed
using 5LM instead of using 7LM?
Because the top two metal layers are required for global routing in chip
design. If the top metal layers are also used at block level, it will create
routing blockage.
In your project what is die size, number of metal layers, technology, foundry,
number of clocks?
Die size: tell in mm, e.g. 1mm x 1mm; remember 1mm = 1000 micron, which is a
big size!
Metal layers: See your tech file. Generally for 90nm it is 7 to 9.
Technology: Again look into tech files.
Foundry: Again look into tech files; e.g. TSMC, IBM, ARTISAN etc.
Clocks: Look into your design and SDC file!
How many macros in your design?
You know it well as you have designed it! An SoC (System On Chip) design may
have 100 macros also!
What is each macro size and number of standard cell count?
Depends on your design.
What are the input needs for your design?
For synthesis: RTL, Technology library, Standard cell library, Constraints
For physical design: Netlist, Technology library, Constraints, Standard cell
library
What does the SDC constraint file contain?
Clock definitions
Timing exceptions: multicycle path, false path
Input and output delays
How did you do power planning? How to calculate core ring width, macro ring
width and strap or trunk width? How to find number of power pad and IO power
pads? How the width of metal and number of straps calculated for power and
ground?
Get the total core power consumption; get the metal layer current density value
from the tech file; divide the total power by the number of sides of the chip;
divide the obtained value by the current density to get the core power ring
width. Then calculate the number of straps using some more equations. Will be
explained in detail later.
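The ring-width recipe above can be sketched as a short calculation. One step is an assumption added here: the power is first converted to a current using the supply voltage (I = P / V), because current density limits are normally quoted in mA per micron of metal width. The function name and all numbers are hypothetical.

```python
def core_ring_width_um(total_power_w, vdd_v, n_sides, jmax_ma_per_um):
    """Estimate the core power ring width in microns."""
    total_current_ma = total_power_w / vdd_v * 1000.0  # I = P / V, in mA (assumed step)
    current_per_side_ma = total_current_ma / n_sides   # share carried by each chip side
    return current_per_side_ma / jmax_ma_per_um        # width = current / current density


# Hypothetical numbers: 0.6 W core power, 1.2 V supply, 4 sides, 1 mA/um limit.
width = core_ring_width_um(0.6, 1.2, 4, 1.0)
print(f"core ring width ~ {width:.1f} um")
```

In practice a safety margin is usually added on top of this minimum, and the same current budget is then split between the ring and the straps.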
How to find total chip power?
Total chip power = standard cell power consumption + macro power consumption +
pad power consumption.
What are the problems faced related to timing?
Prelayout: Setup, max transition, max capacitance
Post layout: Hold
How did you resolve the setup and hold problem?
Setup: upsize the cells
Hold: insert buffers
In which layer do you prefer for clock routing and why?
The next lower layer to the top two metal layers (global routing layers),
because it has less resistance and hence less RC delay.
If in your design has reset pin, then it’ll affect input pin or output pin or
both?
Output pin.
During power analysis, if you are facing IR drop problem, then how did you
avoid?
Increase power metal layer width.
Go for higher metal layer.
Spread macros or standard cells.
Provide more straps.
Define antenna problem and how did you resolve this problem?
A long net can accumulate charge during manufacturing of the device due to the
ionisation process. If this net is connected to the gate of a MOSFET, the
charge can damage the gate dielectric, and the gate may conduct, causing damage
to the MOSFET. This is the antenna problem.
Decrease the length of the net by providing more vias and layer jumping.
Insert antenna diode.
How delays vary with different PVT conditions? Show the graph.
P increase -> delay increase
P decrease -> delay decrease
V increase -> delay decrease
V decrease -> delay increase
T increase -> delay increase
T decrease -> delay decrease
Explain the flow of physical design and inputs and outputs for each step in
flow.
What is cell delay and net delay?
Gate delay
Transistors within a gate take a finite time to switch. This means that a
change on the input of a gate takes a finite time to cause a change on the
output. [Magma]
Gate delay = function of (i/p transition time, Cnet + Cpin).
Cell delay is the same as gate delay.
Cell delay
For any gate it is measured between 50% of the input transition and the
corresponding 50% of the output transition.
Intrinsic delay
Intrinsic delay is the delay internal to the gate, from the input pin of the
cell to the output pin of the cell.
It is defined as the delay between an input and output pair of a cell, when a
near zero slew is applied to the input pin and the output does not see any load
condition. It is predominantly caused by the internal capacitance associated
with its transistors.
This delay is largely independent of the size of the transistors forming the
gate, because increasing the size of the transistors also increases the
internal capacitance.
Net delay (or wire delay)
The difference between the time a signal is first applied to the net and the
time it reaches other devices connected to that net.
It is due to the finite resistance and capacitance of the net. It is also known
as wire delay.
Wire delay = fn(Rnet, Cnet + Cpin)
What are delay models and what is the difference between them?
Linear Delay Model (LDM)
Non Linear Delay Model (NLDM)
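To make the difference concrete: an NLDM library stores cell delay in a two-dimensional lookup table indexed by input transition and output load, and the tool interpolates between table points, whereas a linear model replaces the table with a single linear expression in those variables. The sketch below shows the table lookup with bilinear interpolation. The axis values, table contents and function names are made up for illustration and do not come from any real library.

```python
from bisect import bisect_right


def _bracket(axis, x):
    """Return indices (i, i + 1) of the axis points around x, clamped to the table."""
    i = bisect_right(axis, x) - 1
    i = max(0, min(i, len(axis) - 2))
    return i, i + 1


def nldm_delay(slews, loads, table, slew, load):
    """Bilinear interpolation in a delay table: table[slew_index][load_index]."""
    i0, i1 = _bracket(slews, slew)
    j0, j1 = _bracket(loads, load)
    ts = (slew - slews[i0]) / (slews[i1] - slews[i0])
    tl = (load - loads[j0]) / (loads[j1] - loads[j0])
    d0 = table[i0][j0] + tl * (table[i0][j1] - table[i0][j0])  # along load, low slew row
    d1 = table[i1][j0] + tl * (table[i1][j1] - table[i1][j0])  # along load, high slew row
    return d0 + ts * (d1 - d0)                                 # then along slew


# Hypothetical 2x2 table: slew in ns, load in pF, delay in ns.
slews = [0.1, 0.5]
loads = [0.01, 0.05]
table = [[0.10, 0.20],
         [0.14, 0.28]]
delay = nldm_delay(slews, loads, table, slew=0.3, load=0.03)
print(f"interpolated cell delay ~ {delay:.3f} ns")
```

Queries outside the table range are extrapolated linearly from the nearest pair of points in this sketch; real tools have their own policies for that case.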
What is wire load model?
A wire load model gives estimated R and C values for a net (typically based on
fanout), for use before the actual routing is available.
Why higher metal layers are preferred for Vdd and Vss?
Because they have less resistance, which leads to less IR drop.
What is logic optimization and give some methods of logic optimization.
Upsizing
Downsizing
Buffer insertion
Buffer relocation
Dummy buffer placement
What is the significance of negative slack?
Negative slack ==> there is a setup violation ==> the design can fail.
What is signal integrity? How does it affect timing?
IR drop, Electro Migration (EM), crosstalk and ground bounce are signal
integrity issues.
If IR drop is more ==> delay increases.
Crosstalk ==> there can be setup as well as hold violations.
What is IR drop? How to avoid? How does it affect timing?
There is a resistance associated with each metal layer. This resistance
consumes power, causing a voltage drop, i.e. IR drop.
If IR drop is more ==> delay increases.
What is EM and what are its effects?
Due to high current flow in the metal, atoms of the metal can be displaced from
their original place. When this happens in larger amounts, the metal can open,
or bulging of the metal layer can happen. This effect is known as Electro
Migration.
Effects: Either short or open of the signal line or power line.
What are types of routing?
Global Routing
Track Assignment
Detail Routing
What is latency? Give the types?
Source latency
It is defined as "the delay from the clock origin point to the clock
definition point in the design".
Delay from clock source to beginning of clock tree (i.e. clock definition
point).
The time a clock signal takes to propagate from its ideal waveform origin point
to the clock definition point in the design.
Network latency
It is also known as insertion delay. It is defined as "the delay from the
clock definition point to the clock pin of the register".
The time a clock signal (rise or fall) takes to propagate from the clock
definition point to a register clock pin.
What is track assignment?
Second stage of the routing wherein particular metal tracks (or layers) are
assigned to the signal nets.
What is congestion?
If the number of routing tracks available for routing is less than the required
tracks then it is known as congestion.
Whether congestion is related to placement or routing?
Routing
What are clock trees?
Distribution of clock from the clock source to the sync pin of the registers.
What are clock tree types?
H tree, Balanced tree, X tree, Clustering tree, Fish bone
What is cloning and buffering?
Cloning is a method of optimization that decreases the load of a heavily loaded
cell by replicating the cell.
Buffering is a method of optimization that is used to insert buffers in high
fanout nets to decrease the delay.
What is the difference between soft macro and hard macro?
What is the difference between hard macro, firm macro and soft macro?
What are IPs?
Hard macro, firm macro and soft macro are all known as IP (Intellectual
property). They are optimized for power, area and performance. They can be
purchased and used in your ASIC or FPGA design implementation flow. Soft macro
is flexible for all types of ASIC implementation. Hard macro can be used in a
pure ASIC design flow, not in an FPGA flow. Before buying any IP it is very
important to evaluate the advantages and disadvantages of each type, hardware
compatibility (such as I/O standards) with your design blocks, and reusability
for other designs.
Soft macros
Soft macros are in synthesizable RTL. Soft macros are more flexible than firm
or hard macros.
Soft macros are not specific to any manufacturing process. Soft macros have the
disadvantage of being somewhat unpredictable in terms of performance, timing,
area, or power. Soft macros carry greater IP protection risks because RTL
source code is more portable and therefore less easily protected than either a
netlist or physical layout data.
From the physical design perspective, a soft macro is any cell that has been
placed and routed in a placement and routing tool such as Astro. (This is the
definition given in the Astro Rail user manual!)
Soft macros are editable and can contain standard cells, hard macros, or other
soft macros.
Firm macros
Firm macros are in netlist format. Firm macros are optimized for
performance/area/power using a specific fabrication technology.
Firm macros are more flexible and portable than hard macros. Firm macros are
more predictive of performance and area than soft macros.
Hard macro
Hard macros are generally in the form of hardware IPs (or we term them hardware
IPs!). Hard macros are targeted for a specific IC manufacturing technology.
Hard macros are block level designs which are silicon tested and proven. Hard
macros have been optimized for power or area or timing.
In physical design you can only access the pins of hard macros, unlike soft
macros, which allow us to manipulate them in different ways. You have freedom
to move, rotate and flip, but you can't touch anything inside hard macros.
A very common example of a hard macro is memory. It can be any design which
carries a dedicated single functionality (in general); for example it can be an
MP4 decoder. Be aware of the features and characteristics of a hard macro
before you use it in your design. Other than power, timing and area you should
also know pin properties like sync pin, I/O standards etc. LEF and GDS2 file
formats allow easy usage of macros in different tools.
From the physical design (backend) perspective:
A hard macro is a block that is generated in a methodology other than place and
route (i.e. using full custom design methodology) and is brought into the
physical design database (e.g. Milkyway in Synopsys; Volcano in Magma) as a
GDS2 file.
Synthesis and placement of macros in modern SoC designs are challenging. EDA
tools employ different algorithms to accomplish this task along with the target
of power and area. There are several research papers available on these
subjects.
What is difference between normal buffer and clock buffer?
Clock net is one of the High Fanout Nets (HFNs). The clock buffers are designed
with some special properties like high drive strength and less delay. Clock
buffers have equal rise and fall time. This prevents the duty cycle of the
clock signal from changing when it passes through a chain of clock buffers.
Normal buffers are designed with a W/L ratio such that the sum of rise time and
fall time is minimum. They too are designed for higher drive strength.
What is difference between HFN synthesis and CTS?
HFNs are synthesized in the front end also, but at that moment no placement
information of standard cells is available. Hence the backend tool collapses
the synthesized HFNs. It resynthesizes HFNs based on placement information and
appropriately inserts buffers. The target of this synthesis is to meet delay
requirements, i.e. setup and hold.
For the clock, no synthesis is carried out in the front end because no
placement information of flip-flops is available, so synthesis won't meet true
skew targets. In the backend, clock tree synthesis tries to meet "skew"
targets. It inserts clock buffers (which have equal rise and fall time, unlike
normal buffers!). There is no skew information for any HFNs.
Q1: How would you speed up an ASIC design project by parallel computing? Which
design stages can be distributed for parallel computing, which cannot, and what
procedures are needed for maintaining parallel computing?
Mentioning the following important steps in parallel computing is essential:
1. Partitioning the design
2. Distributing partitioned tasks among multiple CPUs
3. Integrating the results
STAGES: The following answers are acceptable. Others may be accepted if
you gave a reasonable explanation of why you can or cannot use parallel
computing in a particular stage of the flow.
Can use parallel computing:
- Synthesis after partitioning
- Placement (hierarchical design)
- Detailed routing
- DRC
- Functional verification
- Timing Analysis (partition the timing graph)
Cannot use parallel computing:
- Synthesis before partitioning
- Floorplanning
- Flat Placement
- Global Routing
CONSTRAINTS: Mentioning that care must be taken to make sure that
partition boundaries are consistent when integrating the results back together.
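The three steps named above (partition, distribute, integrate) can be sketched with Python's standard library. Everything here is a toy stand-in: the "design" is a list of (name, area) pairs, the per-block "stage" just sums area, and a thread pool stands in for multiple CPUs or machines. None of it reflects how a real EDA tool distributes work.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(cells, n_parts):
    """Step 1: split the flat cell list into n_parts blocks (round-robin)."""
    return [cells[i::n_parts] for i in range(n_parts)]


def run_stage(block):
    """Placeholder for a per-block stage such as synthesis or DRC."""
    return sum(area for _name, area in block)


def integrate(results):
    """Step 3: merge the per-block results back into one answer."""
    return sum(results)


# Hypothetical design: 12 cells with areas of 1, 2 or 3 units.
cells = [(f"u{i}", 1.0 + i % 3) for i in range(12)]
blocks = partition(cells, 4)

# Step 2: distribute the blocks among workers.
with ThreadPoolExecutor(max_workers=4) as pool:
    partial = list(pool.map(run_stage, blocks))

total = integrate(partial)
print(f"{len(blocks)} blocks, total area = {total}")
```

The integration step is where the boundary-consistency concern shows up: here the blocks are independent, but in a real flow nets and timing paths that cross partitions must agree after the results are merged.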
Q2: What kinds of timing violations are in a typical timing analysis report?
Explain!
Ans: Acceptable answers...
- Setup time violations
- Hold time violations
- Minimum delay
- Maximum delay
- Slack
- External delay
Q3: List the possible techniques to fix a timing violation.
- Buffering - Buffers are inserted in the design to drive a load that
is too large for a logic cell to efficiently drive. If the net is too long
then the net is broken and buffers are inserted to improve the transition which
will ultimately improve the timing on data path and reduce the setup violation.
To reduce the hold violations buffers are inserted to add delay on data paths.
- Mapping - Mapping converts primitive logic cells found in a netlist to
technology-specific logic gates found in the library on the timing critical
paths.
- Unmapping - Unmapping converts the technology-specific logic gates in
the netlist to primitive logic gates on the timing critical paths.
- Pin swapping - Pin swapping optimization examines the slacks on the
inputs of the gates on worst timing paths and optimizes the timing by swapping
nets attached to the input pins, so the net with the least amount of slack is
put on the fastest path through the gate without changing the function of the
logic.
- Wire sizing
- Transistor (cell) sizing - Cell sizing is the process of assigning drive
strength for a specific cell in the library to a cell instance in the design.
If there is a low drive strength cell in the timing critical path then this
cell is replaced by higher drive strength cell to reduce the timing violation.
- Re-routing
- Placement updates
- Re-synthesis (logic transformations)
- Cloning - Cell cloning is a method of optimization that decreases the
load of a very heavily loaded cell by replicating the cell. Replication is done
by connecting an identical cell to the same inputs as the original cell. The
fanout load is then divided between the original and the clone, which improves
the timing.
- Taking advantage of useful skew
- Logic re-structuring/Transformation (w/Resynthesis) - Rearrange logic to meet
timing constraints on critical paths of design
- Making sure we don't have false violations (false path, etc.)
Q4: Give the linear time computation scheme for Elmore delay in an RC
interconnect tree.
The following is acceptable:
- Elmore delay formula
T = Sum over all nodes i in path (s,t) of Ri*Ci where Ci is the total
capacitance in the sub tree rooted at node i, or alternatively, the sum over
the capacitances at the nodes times the shared resistance between the path of
interest and the path to the node.
- Explaining terms in formula
- Mentioning something that shows that it can be done in linear time
("lumped" or "shared" resistances, "recursive" calculations, etc.)
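A minimal sketch of the linear-time scheme, assuming the tree is given as parent pointers with nodes numbered so that every parent index is smaller than its child's index (node 0 is the source). The array names and the example values are illustrative only. One bottom-up pass accumulates the subtree capacitance Ci at each node, and one top-down pass accumulates Ri*Ci along the path, so each node is visited twice.

```python
def elmore_delays(parent, res, cap):
    """Elmore delay from the source to every node of an RC tree.

    parent[i]: index of node i's parent (parent[0] is ignored, node 0 is the root)
    res[i]:    resistance of the edge from parent[i] to i (res[0] = driver resistance)
    cap[i]:    capacitance lumped at node i
    Assumes parent[i] < i for every i > 0.
    """
    n = len(parent)

    # Pass 1 (leaves to root): total capacitance in the subtree rooted at each node.
    subtree_cap = list(cap)
    for i in range(n - 1, 0, -1):
        subtree_cap[parent[i]] += subtree_cap[i]

    # Pass 2 (root to leaves): delay(i) = delay(parent) + R_i * C_subtree(i).
    delay = [0.0] * n
    delay[0] = res[0] * subtree_cap[0]
    for i in range(1, n):
        delay[i] = delay[parent[i]] + res[i] * subtree_cap[i]
    return delay


# Hypothetical tree: 0 -> 1, then 1 -> 2 and 1 -> 3 (arbitrary units).
parent = [-1, 0, 1, 1]
res = [0.0, 1.0, 2.0, 3.0]
cap = [0.0, 1.0, 1.0, 1.0]
delays = elmore_delays(parent, res, cap)
print(delays)
```

Both loops touch each node once, which is the "recursive" or "lumped" computation the answer asks you to mention.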
Q5: Given a unit wire resistance "r" and a unit wire capacitance
"c", a wire segment of length "l" and width "w"
has resistance "rl/w" and capacitance "cwl". Can we reduce
the Elmore delay by changing the width of a wire segment? Explain your answer.
You needed to mention that by scaling different segments by different amounts,
you can reduce the delay (e.g. wider segments near the root and narrower
segments near the leaves). If every segment is scaled by the same factor, and
only the wire resistance and capacitance are considered, the delay is
independent of width because the "w" term cancels out.
Q6: Extend the ZST-DME algorithm to embed a binary tree such that the Elmore
delay from the root to each leaf of the tree is identical.
You needed to mention that a new procedure is needed for calculating the Elmore
delay assuming that certain merging points are chosen, instead of just the
total downstream wire-length. The merging segment becomes a set of points with
equal Elmore delay instead of just equal path length. You could refer to the paper
"Low-Cost Single-Layer Clock Trees With Exact Zero Elmore Delay
Skew", Andrew B. Kahng and Chung-Wen Albert Tsao.
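As a sketch of the "new procedure" for one merge step: given two subtrees with Elmore delays t1, t2 and total capacitances c1, c2, joined by a wire of a given length, the tapping point that equalizes the delays to both sides has a closed form obtained by equating the two branch delays. The function names, parameter names and numbers below are assumptions for illustration; the case where the balance point falls outside the wire (which needs wire elongation) is deliberately not handled.

```python
def branch_delay(t_sub, c_sub, seg_len, r, c):
    """Elmore delay through a wire of length seg_len into a subtree (t_sub, c_sub)."""
    return t_sub + r * seg_len * (c * seg_len / 2 + c_sub)


def zero_skew_tap(t1, c1, t2, c2, length, r, c):
    """Fraction x of the wire (measured from subtree 1) where both delays match.

    r and c are the per-unit-length wire resistance and capacitance.
    """
    num = t2 - t1 + r * length * (c2 + c * length / 2)
    den = r * length * (c * length + c1 + c2)
    x = num / den
    if not 0.0 <= x <= 1.0:
        raise ValueError("balance point is off the wire; elongation needed (not handled)")
    return x


# Hypothetical subtrees: subtree 1 is slower but lighter than subtree 2.
t1, c1, t2, c2 = 2.0, 1.0, 0.0, 3.0
length, r, c = 4.0, 1.0, 1.0
x = zero_skew_tap(t1, c1, t2, c2, length, r, c)
left = branch_delay(t1, c1, x * length, r, c)
right = branch_delay(t2, c2, (1 - x) * length, r, c)
print(f"tap at x = {x:.4f}, delays = {left:.5f} / {right:.5f}")
```

In a DME-style bottom-up pass this calculation replaces the equal-path-length rule when building each merging segment; the merged node then carries the common delay and the summed capacitance upward.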
Q7: IPO (sometimes also referred to as "In-Place Optimization") tries
to optimize the design timing by buffering long wires, resizing cells,
restructuring logic etc. Explain how these IPO steps affect the quality of the
design in terms of area, congestion, timing slack.
(a) Why is this called "In-Place Optimization" ?
(b) Why are the two IPO steps different ?
(c) Why are both used ?
IPO optimizes timing by buffer insertion and cell resizing. Important
steps that are performed in IPO include fixing {setup,hold} time, max.
transition time violation. Timing slack along all arcs is optimized by these
operations. Increase in area and reduction in timing slack depend upon timing
and IPO constraints.
(a) This step is referred to as "In-Place Optimization" because IPO
performs resizing and buffer in-place (between cells in the row). It does not
perform placement optimization in this step.
(b) The first IPO step (IPO1) is performed after placement. It performs
trial route --> extraction --> timing analysis to determine the quality of
placement. Setup and hold time fixing is done according to the result of timing
analysis.
The second IPO step (IPO2) is performed after clock tree synthesis. CTS
performs clock buffer insertion to balance skews among all flip-flops. The IPO2
step optimizes timing paths between flip-flops taking the actual clock skew
into account.
(c) If IPO2 step is not performed after CTS, then timing paths between
flip-flops are not tuned for clock skew variation. Even though NanoRoute
performs timing optimization, it is more of buffer insertion in long
interconnect to fix setup and hold times.
Q8: Clocking and Place-Route Flow. Consider the following steps:
- Clock sink placement
- Standard-cell global placement
- Standard-cell detailed placement
- Standard-cell ECO placement
- Clock buffer tree construction
- Global signal routing
- Detailed signal routing
- Bounded-skew (balanced) clock (sub)net routing
- Steiner clock (sub)net routing
- Clock sink useful skew scheduling (i.e., solving the linear program, etc.)
- Post-placement (global routing based) static timing analysis
- Post-detailed routing static timing analysis
(a) As a designer of a clock distribution flow for high-performance
standard-cell based ASICs, how would you order these steps? Is it possible to
use some steps more than once, others not at all (e.g., if subsumed by other
steps).
(b) List the criteria used for assessing possible flows.
(c) What were the 3 next-best flows that you considered (describe as variants
of your flow), and explain why you prefer your given answer.
(a) My basic flow:
(1) SC global placement
(2) post-placement STA
(3) clock sink useful-skew scheduling
(4) clock buffer tree construction that is useful-skew aware (cf. associative
skew.)
(5) standard-cell ECO placement (to put the buffers into the layout)
(6) Steiner clock subnet routing at lower levels of the clock tree (following
CTGen type paradigm)
(7) bounded-skew clock subnet routing at all higher levels of the clock tree,
and as necessary even at lower levels, to enforce useful skews
(8) global signal routing
(9) detailed signal routing,
(10) post-detailed routing STA
(b)Criteria:
(1) likelihood of convergence with maximum clock frequency
(2) minimization of CPU time (by maximizing incremental steps, minimizing
"detailed" steps, and minimizing iterations)
(3) make a good trade-off between wiring-based skew control and wire cost (this
suggests Steiner routing at lower levels, bounded-skew routing at higher
levels).
[Comment 1. Criteria NOT addressed: power, insertion delay, variant flow for
hierarchical clocking or gated clocking.
Comment 2: I do not know of any technology for clock sink placement that can
separate this from placement of remaining standard cells. So, my flow does not
invoke this step. I also don't want post-route ECOs.]
(c) Variants:
(1) introduce Step 11: loop over Steps 3-10 (not adopted because cost benefit
ratio was not attractive, and because there is a trial placement + global
routing to drive useful-skew scheduling, buffer tree construction and ECO
placement);
(2) after Steps 1-4, re-place the entire netlist (global, detailed placement)
and then skip Step 5 (not adopted because benefits of avoiding ECO placement
and leveraging a good clock skeleton were felt to be small: the buffer tree will
largely reflect the netlist structure, and replacing can destroy assumptions
made in Steps 3-4);
(3) can iterate the first 5 steps essentially by iterating: clock sink
placement, (ECO placement for legalization), (incremental) standard-cell
(global + detailed) placement (not adopted because I feel that any objective
for standalone clock sink placement would be very "fuzzy", e.g.,
based on sizes of intersections of fan-in/fan-out cones of sequentially
adjacent FFs)
Q9: If we migrate to the next technology node and double the gate count of a
design, how would you expect the size of the LEF and routed DEF files to
change? Explain your reasoning.
The LEF file will remain roughly the same size (same richness of cell library,
say, between 500-1200 masters), modulo possible changes in conventions (e.g.,
CTLF used to be a part of LEF) and modulo possible additional library model
semantics (e.g., adding power modeling into LEF).
The DEF file should at least double (the components and nets will double), but
if there is extra routing complexity (more complex geometries, and more
segments per connection due to antenna rules or badly scaling router
heuristics) the DEF could grow significantly faster.
Prime Time
PrimeTime is a full chip static timing analysis tool that can fully analyze a
multimillion gate ASIC in a short amount of time.
PrimeTime needs four types of files before you can run it:
1. Netlist file: Verilog, VHDL, EDIF
2. Delay file: SPEF (Standard Parasitic Exchange Format, from StarRC or the
place & route tool), SPF, SDF (Standard Delay Format)
3. Library file: DB (from library vendors)
4. Constraints file: Synopsys Design Constraints (SDC), which must include at
least three things: clock, input delay and output delay
Use Tcl (Tool Command Language) whenever possible.
PrimeTime will check the following violations:
1. Setup violations:
The logic is too slow compared to the clock. With that in mind there
are several things a designer can do to fix the setup violations.
Reduce the amount of buffering in the path.
Replace buffers with 2 inverters placed farther apart.
Reduce larger than normal capacitance on a book’s output pin.
Increase the size of books to decrease the delay through the book.
Make sure clock uncertainty is not too large for the technology library that
you are using.
Reduce clock speed. This is a poor design technique and should be used as a
last resort.
2. Hold time violations: the logic is too fast.
To fix hold violations in the design, the designer needs to simply add more
delay to the data path. This can be done by:
Adding buffers/inverter pairs/delay cells to the data path.
Decreasing the size of certain books in the data path. It is better to reduce
the books closer to the capture flip flop because there is less likelihood of
affecting other paths and causing new errors.
Adding more capacitance to the output pin of books with light capacitance.
Fix the setup time violation first, and then hold time violation.
If hold violations are not fixed before the chip is made, there is nothing that
can be done post fabrication to fix hold problems, unlike setup violations
where the clock speed can be reduced.
3. Transition violations:
When a signal takes too long transitioning from one logic level to another, a
transition violation is reported. The violation is a function of the node
resistance and capacitance.
The designer has two simple solutions to fix the transition violations.
Increase the drive capacity of the book to increase the voltage swing, or
decrease the capacitance and resistance by moving the source gate closer to the
sink gate.
Increase the width of the route at the violation instance pin. This will
decrease the resistance of the route and fix the transition violation.
4. Capacitance violations:
The capacitance on a node is the sum of the input pin capacitances of
the fan-out and the capacitance of the net. This check ensures that a
device does not drive more capacitance than it is characterized for.
The violation can be removed by:
Increasing the drive strength of the book.
Buffering some of the fan-out paths to reduce the capacitance seen
by the output pin.
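A small sketch of the capacitance check described above, and of how buffering part of the fan-out lowers the load on the original driver. All capacitance values and the limit are assumed for illustration.

```python
def total_load_ff(sink_pin_caps_ff, wire_cap_ff):
    """Load seen by a driver: sum of sink pin caps plus net capacitance (fF)."""
    return sum(sink_pin_caps_ff) + wire_cap_ff


def violates_max_cap(sink_pin_caps_ff, wire_cap_ff, max_cap_ff):
    """True when the driver sees more load than its characterized limit."""
    return total_load_ff(sink_pin_caps_ff, wire_cap_ff) > max_cap_ff


MAX_CAP_FF = 30                      # assumed limit for the driving cell
sinks = [4] * 6                      # six sinks of 4 fF each (assumed)
before = total_load_ff(sinks, wire_cap_ff=10)

# Move three sinks behind a buffer: the driver now sees three sinks,
# one buffer input pin (2 fF, assumed) and a shorter net (6 fF, assumed).
after = total_load_ff([4, 4, 4, 2], wire_cap_ff=6)

print(before, violates_max_cap(sinks, 10, MAX_CAP_FF))         # 34 True
print(after, violates_max_cap([4, 4, 4, 2], 6, MAX_CAP_FF))    # 20 False
```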
Operating conditions used for each check:
WorstCase => setup violations
BestCase => hold violations
We use the worst-case (slowest) delays when checking for setup violations
and the best-case (fastest) delays when checking for hold violations.
How to run PrimeTime in Unix?
[Linux] user@gmu>> pt_shell -f pt_script.tcl |& tee pt.log
Here is a sample PrimeTime script.
A total of three scripts must be created, one for each timing corner.
# ------------------------------------------------------------
# Library Declarations.
# ------------------------------------------------------------
set search_path ". /proj/timing/etc"
set link_path "*"
lappend link_path "stdCell_tt.db"
# ------------------------------------------------------------
# Read in Design
# ------------------------------------------------------------
# Read in netlist
read_file -f verilog top_level.v
# Define top level in the hierarchy
current_design "top_level"
# Combine verilog and db files and identify any errors.
link_design
# Read in SPEF file
read_parasitics -quiet -format SPEF top_level.spef.gz
# ------------------------------------------------------------
# Apply Constraints
# ------------------------------------------------------------
# Read in timing constraints
read_sdc -echo top_level.sdc
# Propagate clocks and add uncertainty to setup/hold calculations
set_propagated_clock [all_clocks]
set_clock_uncertainty 0.2 [all_clocks]
# ------------------------------------------------------------
# Time
# ------------------------------------------------------------
set_operating_conditions -min WORST -max WORST
# Register to Register
report_timing -from [all_registers -clock_pins] \
-to [all_registers -data_pins] -delay_type max \
-path_type full_clock -nosplit \
-max_paths 1 -nworst 1 \
-trans -cap -net > tc_reg2reg_setup.rpt
report_timing -from [all_registers -clock_pins] \
-to [all_registers -data_pins] -delay_type min \
-path_type full_clock -nosplit \
-max_paths 1 -nworst 1 \
-trans -cap -net > tc_reg2reg_hold.rpt
# Register to Out
report_timing -from [all_registers -clock_pins] \
-to [all_outputs] -delay_type max \
-path_type full_clock -nosplit \
-max_paths 1 -nworst 1 \
-trans -cap -net > tc_reg2out_setup.rpt
report_timing -from [all_registers -clock_pins] \
-to [all_outputs] -delay_type min \
-path_type full_clock -nosplit \
-max_paths 1 -nworst 1 \
-trans -cap -net > tc_reg2out_hold.rpt
# In to Register
report_timing -from [all_inputs] \
-to [all_registers -data_pins] \
-delay_type max \
-path_type full_clock -nosplit \
-max_paths 1 -nworst 1 -trans \
-cap -net > tc_in2reg_setup.rpt
report_timing -from [all_inputs] \
-to [all_registers -data_pins] \
-delay_type min -path_type full_clock \
-nosplit -max_paths 1 -nworst 1 \
-trans -cap -net > tc_in2reg_hold.rpt
# All Violators - Find Cap/Tran Violations
# Summary of Setup/Hold Violations
report_constraint -all_violators > tc_all_viol.rpt
# Clock Skew
report_clock_timing -type skew -verbose > tc_clockSkew.rpt
exit
Q 1. What are the types of physical verification?
LVS (layout versus schematic).
DRC (design rule check).
ERC (electrical rule check).
LEC (logical equivalence check).
Q 2. How to fix setup and hold violations at the same time?
It is not possible to fix both at once, because increasing the delay in
the data path is good for hold and bad for setup. The only way out is:
- Buffer the data path for the hold fix.
- Slow the clock frequency for the setup fix (this is not a real fix, but
there is no other option).
Q 3. How can you avoid cross-talk?
a) Increase the spacing between the aggressor and victim nets.
b) Shielding.
c) Maintain a stable supply.
d) Increase the drive strength of the victim's driver cell.
e) Layer jumping.
f) Increase the victim net width, which decreases its resistance.
g) Guard ring.
h) Cell sizing (upsizing).
Q 4. What is cross-talk?
It is the undesirable electrical interaction between two or more physically
adjacent nets due to capacitive cross-coupling. When two nets run in
parallel, the electric field of one net affects the other net that is
near it. This is called the cross-talk effect.
Q 5. What is scan chain reordering?
It is the process of reconnecting the scan chains in the design to
optimize for routing. Reordering the scan chain connections improves
timing and congestion.
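The idea behind reordering can be shown with a greedy nearest-neighbor pass over placed flop locations. This is only an illustrative heuristic with hypothetical flop names and coordinates. It is not the algorithm used by any particular place-and-route tool.

```python
def manhattan(a, b):
    """Manhattan distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def chain_length(start, flops, order):
    """Total Manhattan wirelength of a scan chain visited in the given order."""
    total, pos = 0, start
    for name in order:
        total += manhattan(pos, flops[name])
        pos = flops[name]
    return total


def reorder_nearest_neighbor(start, flops):
    """Greedy reorder: always hop to the closest flop not yet in the chain."""
    remaining = dict(flops)
    order, pos = [], start
    while remaining:
        # Ties are broken by name so the result is deterministic.
        name = min(remaining, key=lambda n: (manhattan(pos, remaining[n]), n))
        order.append(name)
        pos = remaining.pop(name)
    return order


# Hypothetical placement: scan-in port at the origin, four flops on one row.
FLOPS = {"ff_a": (10, 0), "ff_b": (1, 0), "ff_c": (9, 0), "ff_d": (2, 0)}
SCAN_IN = (0, 0)
original = ["ff_a", "ff_b", "ff_c", "ff_d"]   # order inherited from synthesis
reordered = reorder_nearest_neighbor(SCAN_IN, FLOPS)

print(chain_length(SCAN_IN, FLOPS, original))    # 34
print(chain_length(SCAN_IN, FLOPS, reordered))   # 10
```

The netlist-order chain zigzags across the row, while the reordered chain follows the placement. The shorter scan routing is what relieves congestion.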
Q 6. What is the concept of rows in the floor plan?
The std-cells in the design are placed in rows. All rows have equal height
and spacing. The width of a row can vary. The std-cells in a row get
their power and ground connections from the vdd and vss rails. Sometimes the
technology allows rows to be flipped so that adjacent rows can share power
and ground rails in a vdd-vss-vdd pattern.
Q 7. What are the advantages of NDRs?
a) Applying double width helps avoid EM.
b) Applying double spacing helps avoid cross-talk.
c) Helps to avoid congestion on the lower metal layers.
d) Helps pin accessibility of std-cells.
Q 8. What is temperature inversion?
At larger CMOS technology nodes, cell delay increases when temperature
increases. At smaller nodes, i.e. below 65nm, cell delay can instead
decrease as temperature increases.
Q 9. In a reg to reg path, if you have a setup problem, where will you insert a buffer?
A. We can insert a buffer near the launch flop, which improves the
transition time. This decreases the wire delay, so the overall delay
decreases. When the arrival time decreases, the setup slack (required time -
arrival time) improves and setup violations reduce.
Q 10. What is partitioning?
It is the process of dividing the chip into small blocks. This is done
mainly to separate different functional blocks and also to make placement
and routing easier.
Q 11. How can you reduce dynamic power?
a) Reduce power supply voltage.
b) Reduce voltage swing in all nodes.
c) Reduce the switching probability (transition factor).
d) Reduce load capacitance.
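Each item in the list above maps to a term of the usual switching-power estimate P = activity x C x Vdd^2 x f. A minimal sketch with assumed numbers follows. Whether a factor of 1/2 appears depends on how the activity factor is defined, and here it is assumed to be folded into `activity`.

```python
def dynamic_power(activity, c_switched_f, vdd_v, freq_hz):
    """Switching power in watts: activity * C * Vdd^2 * f."""
    return activity * c_switched_f * vdd_v ** 2 * freq_hz


# Assumed block: 1 nF of total switched capacitance at 1 GHz.
base = dynamic_power(activity=0.25, c_switched_f=1e-9, vdd_v=1.0, freq_hz=1e9)
low_vdd = dynamic_power(activity=0.25, c_switched_f=1e-9, vdd_v=0.5, freq_hz=1e9)
low_act = dynamic_power(activity=0.125, c_switched_f=1e-9, vdd_v=1.0, freq_hz=1e9)

print(f"base={base:.4f} W  half Vdd={low_vdd:.4f} W  half activity={low_act:.4f} W")
```

Because the supply voltage enters squared, halving it cuts the estimate to a quarter, while halving the activity or the capacitance only halves it.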
Q 12. Why double via insertion?
To reduce the yield loss due to via failures, double vias are inserted.
Traditionally, double vias were inserted post-route, and the routing was
then modified to fix any DRCs.
Q 13. What is metal fill insertion?
The chemicals used during etching and polishing remove more metal in
regions where the metal density is uneven. Metal fills are inserted to
even out the metal density and limit this loss.
Q 14. What is metal slotting?
It is the technique of cutting slots in wide metal wires to avoid
problems like metal lift-off and metal erosion.
Q 15. What are the power dissipation components?
Dynamic power consumption: occurs when signals passing through the CMOS
circuit change their logic state, charging and discharging the output node
capacitance.
Static (leakage) power consumption: the power consumed by sub-threshold
currents and by reverse-biased diodes in a CMOS transistor.
Short-circuit power consumption: occurs during switching, when both the
NMOS and PMOS transistors conduct simultaneously for a short amount of
time.
Q 16. What is the dishing effect?
It is defined as the difference between the height of the oxide in the
spaces and that of the metal in the trenches. It is caused by CMP. It can
be reduced effectively by dummy fill techniques.
Q 17. What is CMP (chemical mechanical polishing)?
It is the process of smoothing a surface with a combination of chemical
and mechanical forces. It is used in IC fabrication to get a high level of
planarization.
Q 18. What is the use of placement blockage?
a) Defines std-cell and macro area.
b) Reserves channels for buffer insertion.
c) Prevents cells from being placed at or near macros.
d) Prevents congestion near macros.
Q 19. What are the types of global routing?
a) Timing-driven global routing.
b) Cross-talk-driven global routing.
c) Incremental global routing.
Q 20. What are the violations solved in LVS?
a) Shorts.
b) Opens.
c) Missing text layers.
d) Missing lib in GDS.
e) Missing soft layers.
Q 21. What is clock latency?
It is the delay between the clock source and the clock pin of a flip-flop.
It is of two types: clock source latency and clock network latency. The
time taken from the clock source to the clock definition pin is the clock
source latency, and the time from the clock definition pin to the clock pin
of the flip-flop is the clock network latency.
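The two latency components simply add, and the skew between two flops is the difference of their total latencies. A minimal sketch with assumed delays in ns follows. The function names are hypothetical.

```python
def clock_latency(source_latency, network_latency):
    """Total latency at a flop clock pin: source part plus network part."""
    return source_latency + network_latency


def clock_skew(launch_latency, capture_latency):
    """Skew between two flops: capture latency minus launch latency."""
    return capture_latency - launch_latency


SOURCE = 0.5                          # clock source to clock definition pin
ff1 = clock_latency(SOURCE, 0.25)     # definition pin to FF1 clock pin
ff2 = clock_latency(SOURCE, 0.375)    # definition pin to FF2 clock pin
skew = clock_skew(ff1, ff2)
print(ff1, ff2, skew)                 # 0.75 0.875 0.125
```

The source latency is common to both flops and cancels out of the skew. Only the network latencies differ.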
Q 22. How to fix setup and hold violations?
A. Setup:
- Reduce the number of buffers in the path.
- Replace buffers with 2 inverters.
- Replace HVT cells with LVT cells.
- Increase the drive size/strength.
- Insert repeaters.
- Adjust cell position in layout.
Hold:
- Add delay in the data path.
- Decrease the drive strength in the data path.
Q 23. What are the inputs of floor plan?
- .v (netlist)
- .lib and .lef
- .sdc
- tlu+ file
- Physical partitioning information of the design.
- Floor plan parameters like height, width, aspect ratio, utilization.
- Pin/pad positions.
Q 24. What are the outputs of floor plan?
- Die/block area.
- I/O pads placed.
- Macros placed.
- Power grid design.
- Power pre-routing.
- Std-cell placement area.
Q 25. What is keep-out margin?
It is the region around the boundary of fixed macros in the design in
which no other macros or standard cells are allowed. Only buffers and
inverters are allowed in this area.
Q 26. How will you synthesize the clock tree?
a) Single clock: normal synthesis and optimization.
b) Multiple clocks: synthesize each clock separately.
c) Multiple clocks with domain crossing: synthesize each clock separately
and balance the skew.
Q 27. What is IR drop?
Each metal layer has a resistance. When current flows through the metal,
some voltage is dropped across this resistance (V = I x R). This is the IR
drop. The higher the resistance, the larger the drop.
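A worked example of the V = I x R relation for a single power strap, with the strap resistance estimated from sheet resistance as R = Rsheet x length / width. The sheet resistance, dimensions, and current below are assumed values for illustration only.

```python
def strap_resistance(sheet_res_ohm_per_sq, length_um, width_um):
    """Resistance of a rectangular strap: sheet resistance times squares."""
    return sheet_res_ohm_per_sq * (length_um / width_um)


def ir_drop(current_a, resistance_ohm):
    """Voltage dropped across the strap: V = I * R."""
    return current_a * resistance_ohm


R_SHEET = 0.05      # ohm/square, assumed
CURRENT = 0.005     # 5 mA drawn through the strap, assumed

r_narrow = strap_resistance(R_SHEET, length_um=1000.0, width_um=5.0)
r_wide = strap_resistance(R_SHEET, length_um=1000.0, width_um=10.0)
drop_narrow = ir_drop(CURRENT, r_narrow)
drop_wide = ir_drop(CURRENT, r_wide)
print(f"narrow: {drop_narrow * 1e3:.1f} mV, wide: {drop_wide * 1e3:.1f} mV")
```

Doubling the strap width halves its resistance and therefore the drop, which is why wider or additional straps are the usual first fix.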
Q 28. How to reduce power dissipation using HVT and LVT in the design?
If a path has positive slack, use HVT cells in it, and if a path has
negative slack, use LVT cells in it. HVT cells have larger delay and less
leakage power. LVT cells have less delay and more leakage power. To meet
timing use LVT cells, and to reduce leakage power use HVT cells.
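The slack-based rule above can be sketched as a small decision function. The margin parameter is an added assumption: it keeps a path from being swapped to HVT when the slack is too small to absorb the extra delay. Path names and slack values are hypothetical.

```python
def choose_vt(slack_ns, current_vt, hvt_margin_ns=0.0):
    """Pick a threshold-voltage flavor for the cells on a path from its slack."""
    if slack_ns < 0.0:
        return "LVT"            # failing timing: trade leakage for speed
    if slack_ns > hvt_margin_ns:
        return "HVT"            # comfortable slack: trade speed for leakage
    return current_vt           # marginal slack: leave the path alone


# Hypothetical paths: (slack in ns, current Vt flavor).
paths = {
    "path_a": (-0.05, "HVT"),
    "path_b": (0.30, "LVT"),
    "path_c": (0.02, "LVT"),
}
plan = {name: choose_vt(slack, vt, hvt_margin_ns=0.10)
        for name, (slack, vt) in paths.items()}
print(plan)
```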
Q 29. What is wire load model (WLM)?
It is an estimation of delay based on area and fan-out. The delay depends
on:
Resistance.
Capacitance.
Area of the nets.
Q 30. What is signal integrity?
It is the ability of an electrical signal to carry information reliably
and to resist the effects (cross-talk, EM) of high-frequency
electromagnetic interference from nearby signals.
Q 31. Does cross-talk always cause violations?
No. Cross-talk adds or subtracts energy from the victim signal, which
changes its delay. It causes a setup or hold violation only when that
change in delay exceeds the available slack.
Q 32. How will a positive or negative edge triggered flip-flop affect setup and hold violations?
A positive edge triggered flip-flop favours setup (setup violations
reduce). A negative edge triggered flip-flop favours hold (hold violations
reduce).
Q 33. What are the i/p's and o/p's of power planning?
i/p's:
- Database with a valid floor plan.
- Power ring and power strap widths.
- Spacing between vdd and vss straps.
o/p:
- Design with power structure.
Q 34. What are the i/p's and o/p's of placement?
i/p's:
- Netlist.
- Mapped and floor planned design.
- Logical and physical lib.
- Design constraints.
o/p's:
- Physical layout information.
- Cell placement locations.
- Physical layout, timing and technology information of the lib.
Q 35. If we increase the fan-out of a cell, how will it affect delay?
Higher fan-out leads to increased capacitive load on the driving gate,
and therefore to longer propagation delay.
Q 36. What are multi driven nets?
They can be created in RTL by introducing drivers of the same or different
signal strengths on one net. However, nets with multiple drivers are not
considered good practice. They could lead to failures in post-silicon
verification, as the driver strengths can be heavily altered by
manufacturing defects. Many EDA tools don't allow multi driven nets in the
design, and designers are expected to remove all multi driven nets from
the design.
Q 37. What is magnetic placement?
To improve the timing of the design or to improve the congestion of a
complex floor plan, we can use magnetic placement to specify fixed objects
as magnets. ICC then moves their connected standard cells close to them.
For the best results, perform magnetic placement before the standard cells
are placed.
Q 38. What is a lookup table?
The table is indexed by input transition and output load values. It is
used to calculate the cell delay.
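When the actual transition and load fall between table entries, the delay is interpolated from the surrounding points. Below is a minimal sketch of bilinear interpolation over a hypothetical 2x2 table. The table values and function names are assumptions, and real delay calculators may interpolate or extrapolate differently.

```python
from bisect import bisect_right


def _bracket(axis, x):
    """Indices (i, i + 1) of the axis segment used for x, clamped to the table."""
    i = bisect_right(axis, x) - 1
    i = max(0, min(i, len(axis) - 2))
    return i, i + 1


def lookup_delay(slews, loads, table, slew, load):
    """Bilinear interpolation in a delay table indexed as table[slew][load]."""
    i0, i1 = _bracket(slews, slew)
    j0, j1 = _bracket(loads, load)
    ts = (slew - slews[i0]) / (slews[i1] - slews[i0])
    tl = (load - loads[j0]) / (loads[j1] - loads[j0])
    # Interpolate along the load axis on both slew rows, then between the rows.
    d0 = table[i0][j0] + tl * (table[i0][j1] - table[i0][j0])
    d1 = table[i1][j0] + tl * (table[i1][j1] - table[i1][j0])
    return d0 + ts * (d1 - d0)


# Hypothetical table: input slew in ns, output load in pF, delay in ns.
SLEWS = [0.0, 1.0]
LOADS = [0.0, 2.0]
DELAYS = [[1.0, 3.0],
          [2.0, 4.0]]

print(lookup_delay(SLEWS, LOADS, DELAYS, slew=0.5, load=1.0))   # 2.5
```

Points outside the table are extrapolated linearly from the nearest segment in this sketch.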
Q 39. What do we do for low power design?
We apply low power techniques:
- Clock gating.
- Multi voltage design.
- Power gating.
- Multiple vt libraries.
Q 40. What are the types of checks done in PrimeTime?
a) Timing (setup, hold, transition).
b) Design constraints.
c) Nets.
d) Noise.
e) Clock skew.
Q 41. What analysis do we do during floor plan?
a) Overlapping of macros.
b) Allowable IR drop.
c) Global route congestion.
d) Physical information of the design.
Q 42. What are the different types of delay models?
a) WLM (wire load model)
b) NLDM (non-linear delay model)
c) CCS (composite current source)
Q 43. Where is placement blockage created?
It is created at the floor plan stage, where it acts as a guideline for
the placement of standard cells. In the CTS stage, many buffers and
inverters are added to balance the skew, and blockages are used to reserve
space for them.
Q 44. Why do we apply NDRs in placement?
NDRs are applied at the placement stage to avoid congestion and timing
problems, which are difficult to fix at the routing stage. NDRs are special
rules like double spacing and double width.
Q 45. What is mesh?
The horizontal and vertical power straps in the design are called the mesh.
Q 46. Why are I/O cells placed in the design?
The I/O cells are the interface between the blocks outside the chip and
the internal blocks of the chip. In the floor plan stage, I/O cells are
placed between the core and the die. They are also responsible for
supplying voltage to the cells in the core.
Q 47. What are the complex cells in the floor plan?
These are cells made of a group of std-cells based on a functionality
requirement. Their height is greater than that of the std-cells and less
than that of the macros.
Q 48. How to fix Electromigration (EM)?
a) Downsize the driver.
b) Increase the metal width.
c) Add more vias.
d) Spread cells.
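EM risk is driven by current density, so the first two fixes above can be seen by computing J = I / (width x thickness). A minimal sketch with assumed wire dimensions and an assumed limit follows. Real limits come from the foundry's technology rules.

```python
def current_density(current_ma, width_um, thickness_um):
    """Average current density through the wire cross-section, in mA/um^2."""
    return current_ma / (width_um * thickness_um)


J_LIMIT = 5.0   # assumed EM limit in mA/um^2, for illustration only

j_single = current_density(current_ma=1.0, width_um=0.5, thickness_um=0.25)
j_double_width = current_density(current_ma=1.0, width_um=1.0, thickness_um=0.25)
j_smaller_driver = current_density(current_ma=0.5, width_um=0.5, thickness_um=0.25)

print(j_single, j_single > J_LIMIT)                   # 8.0 True
print(j_double_width, j_double_width > J_LIMIT)       # 4.0 False
print(j_smaller_driver, j_smaller_driver > J_LIMIT)   # 4.0 False
```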
Q 49. What is etching?
It is used in micro-fabrication to chemically remove layers from the
surface of the wafer during manufacturing.
Q 50. What is SOI technology?
It refers to the use of a layered silicon-insulator-silicon substrate in
place of a conventional silicon substrate. It reduces leakage current and
lowers power consumption.
Q 51. What are aggressor and victim?
These two terms come from the cross-talk concept.
Aggressor: a net that creates the effect on a nearby net (the victim).
Victim: a net that receives the effect from a nearby net (the aggressor).