Clock Tree Synthesis
Hi Readers,
This article will be a little technical. It’s about Clock Tree Synthesis (CTS).
CTS is one of the important steps in the Physical Design Flow. It’s the process of building a robust clock tree, i.e. balancing the clock skew and minimising the insertion delay.
Clock nets are the most frequently switching nets in any chip. They can couple with neighbouring nets and cause functional failures. Hence, they need special care.
There are many articles that describe the settings, optimisation scripts, sanity checks, etc. You can read about them here →
However, in this article I would like to highlight some of the processes that happen inside the tool while it executes the CTS step.
What happens at CTS?
- Clocks are propagated.
- NDRs (non-default routing rules) are applied.
- Exceptions such as false paths, multi-cycle paths and don’t-touch attributes are honoured.
- Generated clocks are taken care of.
- Clock Tree is built with the cell list that is provided.
- OCV derates impact timing through the non-common part of the clock path.
- Clock gate enable timing is no longer ideal.
- Inter-clock timing depends on achievable insertion delays.
- Clock generator control logic timing is no longer ideal.
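To see why OCV derates matter only on the non-common clock path, here is a minimal Python sketch. It assumes a simple flat derate model (one late factor, one early factor) and that the tool credits back the pessimism on the shared segment, commonly called common path pessimism removal. The function name, derate values and delays are all hypothetical and chosen only for illustration.

```python
def ocv_setup_clock_delta(common, launch_branch, capture_branch,
                          late=1.05, early=0.95):
    """Return (raw, credit, effective) launch-minus-capture clock arrival in ns.

    Assumed model: for a setup check the launch clock path is derated late
    and the capture clock path is derated early.
    """
    launch_arrival = (common + launch_branch) * late
    capture_arrival = (common + capture_branch) * early
    raw = launch_arrival - capture_arrival
    # The shared segment cannot be both fast and slow at once,
    # so its derate difference is credited back.
    credit = common * (late - early)
    return raw, credit, raw - credit


# Hypothetical numbers: 0.8 ns shared trunk, 0.2 ns on each branch.
raw, credit, effective = ocv_setup_clock_delta(0.8, 0.2, 0.2)
print(f"raw={raw:.3f} ns, credit={credit:.3f} ns, effective={effective:.3f} ns")
```

In this model the effective penalty depends only on the two branch delays, which is one reason a clock tree with a long shared trunk and short branches tends to suffer less from OCV.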
Anchor Buffer/Transport Buffer → It’s a strong buffer (capable of driving a long net). It is usually added at the beginning of the clock tree, i.e. near the clock pin.
Skew Anchor/Sink Point → It’s a clock endpoint that controls the downstream clock tree. For example, a register that acts as a div2 clock generator has its clock pin as a skew anchor, because the arrival time of the clock at that pin affects the arrival times of all clocks after it.
Managing clock skew better often reduces hold failures in the design. As a rough rule of thumb, 600–650 clock buffers are added for every 100K gates. Building the clock tree with only inverters yields less DCD (Duty Cycle Distortion).
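The buffer rule of thumb above can be turned into a quick estimate. The sketch below simply scales the 600–650 range linearly with gate count; the function name and the 2.5M-gate design are made up for illustration, and real buffer counts depend heavily on the design and the library.

```python
def estimate_clock_buffers(gate_count, per_100k=(600, 650)):
    """Scale the per-100K-gates rule of thumb linearly with gate count."""
    scale = gate_count / 100_000
    return round(per_100k[0] * scale), round(per_100k[1] * scale)


# Hypothetical design with 2.5 million gates.
low, high = estimate_clock_buffers(2_500_000)
print(f"expect roughly {low} to {high} clock buffers")
```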
GLOBAL SKEW → It’s the difference between the longest insertion delay and the shortest insertion delay.
(Skew comes from path-to-path differences in buffer delay, RC delay and delay due to process effects.)
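As a small worked example of the global skew definition, the Python sketch below takes a table of insertion delays per clock sink and subtracts the shortest from the longest. The sink names and delay values are invented for illustration.

```python
def global_skew(insertion_delays):
    """Longest insertion delay minus shortest insertion delay (same units)."""
    longest = max(insertion_delays.values())
    shortest = min(insertion_delays.values())
    return longest - shortest


# Hypothetical insertion delays in ns, keyed by flop clock pin.
delays_ns = {"ff_a/CK": 1.20, "ff_b/CK": 1.05, "ff_c/CK": 1.32}
skew = global_skew(delays_ns)
print(f"global skew = {skew:.2f} ns")
```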
create_clock command →
- Every create_clock constraint becomes the starting point for insertion delay calculation in propagated mode.
- Every create_clock constraint triggers a new clock tree during CTS building.
- A create_clock constraint defined on a hierarchical pin is not supported during CTS.
- Clock logic should not be synthesised during the synthesis stage.
- create_generated_clock enables insertion delay calculation from the master clock, unlike the create_clock command.
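To make the last point concrete, here is a minimal Python sketch of insertion delay measured from the master clock through a divider register (the skew anchor described earlier). It assumes the total is just the sum of three segments; the function name, sink names and delay values are hypothetical.

```python
def insertion_delay_from_master(master_to_divider_ck, divider_ck_to_q,
                                generated_to_sink):
    """Sum the segments from the master clock source to a generated-clock sink."""
    return master_to_divider_ck + divider_ck_to_q + generated_to_sink


# Hypothetical delays in ns for two sinks of a div2 generated clock.
generated_branch_ns = {"ff_x/CK": 0.40, "ff_y/CK": 0.45}
total_ns = {
    sink: insertion_delay_from_master(0.60, 0.10, branch)
    for sink, branch in generated_branch_ns.items()
}
for sink, delay in total_ns.items():
    print(f"{sink}: {delay:.2f} ns from master clock")
```

If the generated clock were defined with create_clock instead, the calculation would start at the divider output, and the first two segments would not be counted.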
What does Meeting Skew Targets mean?
It means making the insertion delay from the clock source point to all the flop endpoints almost equal, so that the resulting skew stays within the target.
Clock Tree Optimisation →
- Upsize/downsize buffers, inverters, combinational cells and sequential cells.
- Limiting the number of buffers and inverters used.
- Level Adjustment.
- Delay Insertion.
- Dummy Load Insertion.
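As an illustration of the delay-insertion idea, the Python sketch below computes how much delay each sink would need so that the global skew falls within a target. It is a deliberate simplification: it only pads the fast sinks, whereas a real tool also resizes cells, adjusts levels and adds dummy loads. The function name, sink names and numbers are hypothetical.

```python
def padding_for_target(insertion_delays, skew_target):
    """Return extra delay per sink so that (longest - shortest) <= skew_target."""
    longest = max(insertion_delays.values())
    floor = longest - skew_target  # no sink may arrive earlier than this
    return {
        sink: round(max(0.0, floor - delay), 6)
        for sink, delay in insertion_delays.items()
    }


# Hypothetical insertion delays in ns and a 0.10 ns skew target.
sink_delays_ns = {"ff_a/CK": 1.20, "ff_b/CK": 1.05, "ff_c/CK": 1.32}
padding_ns = padding_for_target(sink_delays_ns, skew_target=0.10)
for sink, pad in padding_ns.items():
    print(f"{sink}: add {pad:.2f} ns")
```

Padding only the early sinks keeps the longest insertion delay unchanged, which matches the goal of balancing skew without increasing insertion delay.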
CTS INTERNAL FLOW →
- Initialisation. (Library trimming, identify place-able area, validate transition and skew targets)
- Construction. (Clustering, Legalisation, DRV Repair)
- Implementation. (Optimise Insertion Delay, power and skew balancing)
- EGR Post Conditioning. (early global route, area reclaim, DRV repair)
- Clock Routing. (detail routing along global route guides)
- NR Post Conditioning. (DRV and skew repair)
- Post CTS Optimisation. (scan re-ordering, data-path optimisation and use of useful skew)
Edit —
CTS Problems —
- Clock Skew.
- Long Insertion Delay.
- Skew across clocks.
- Heavy clock net loading.
- Crosstalk and EM.
CTS Analysis —
- Report Timing.
- Report insertion delay (ID), skew and targets.
- DRV Report.
- CTS Exceptions.
- Clock QoR.
- Report DRV, Power and Area.
Things to be noted —
- Check “don’t use” and “don’t touch” attributes.
- Check “don’t buffer” net.
- Check “don’t size” and “size only” cells.
- Check for “clock exceptions”.
Disclaimer → These ideas have been gathered from various resources from the web. I do not own them nor am I responsible for the correctness.
-Sethupathi Balakrishnan