ADE Assembler Flow for Rapid Design of High-Speed Low-Power Circuits
Wouter Soenen, Bart Moeneclaey, Xin Yin and Johan Bauwelinck
IDLab, Department of Information Technology
High-speed and low-power circuit design: a challenging task to meet future data traffic needs while reducing the ecological footprint. (Photo: Building Distribution Frame room at Facebook's Altoona data center, ©2014 Jacob Sharp Photography. Figure sources: http://www.ethernetalliance.org/roadmap/ and https://arnoudm.wordpress.com/2009/01/13/datacenter-power-consumtion/)
Circuit performance depends on many factors, leading to exhaustive iterations between schematic and layout. (Figure: traditional design flow. Circuit performance depends on topology, layout parasitics, stimuli and load conditions, statistical variations, and PVT corners.)
The variation-aware design flow reduces design time but is still insufficient for high-speed circuits. (Figure: variation-aware design, from T. McConaghy, K. Breen, J. Dyck, and A. Gupta, "Variation-Aware Design of Custom Integrated Circuits: A Hands-on Field Guide", Springer-Verlag New York, pp. 10, 70-71, 2013.)
A parasitic- and variation-aware design (PVAD) flow supported by Virtuoso ADE Assembler can significantly reduce time-to-market. (Figure: parasitic-aware design combined with variation-aware design.)
Outline
PVAD flow illustrated on a 40 Gb/s flip-flop cascade using ADE Assembler:
  Initial design and specifications
  Creating a parasitic estimated view
  Corner extraction
  Size over corners
  Fast corner verification
  Layout and extraction
  Yield verification
Conclusion
Initial design and specifications
The parasitic- and variation-aware design flow is applied to a flip-flop cascade that is part of a PAM-4 laser driver with a 4-tap equalizer. (Figure: driver block diagram with the flip-flop cascade under test highlighted, from W. Soenen, R. Vaernewyck, X. Yin, S. Spiga, and M. Amann, "56 Gb/s PAM-4 Driver IC for Long-Wavelength VCSEL Transmitters", 42nd European Conference on Optical Communication, Dusseldorf, Germany, 2016, pp. Th.1.C.4.)
Each flip-flop requires differential data and clock inputs and a reference current, and provides a differential data output for the next flip-flop and a buffered data output for the equalizer tap driver.
Each flip-flop consists of two current-mode logic data latches using SiGe bipolar junction transistors.
The data-to-clock delay (tdq) of the 1st latch and the clock-to-output delay (tcq2) and output swing (vo2_amplitude) of the 2nd flip-flop are monitored in transient simulation. (Figure: timing diagram of the DATA, CLK, vo1 and vo2 waveforms, annotated with Tbit, tdq, tcq2 and vo2_amplitude.)
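Delays such as tcq2 can be measured from sampled transient waveforms by locating threshold crossings. The following is a minimal sketch, assuming differential waveforms that switch around 0 V; the function name and the sample data are hypothetical and are not part of ADE. The only number taken from the slides is the 40 Gb/s data rate, which gives a bit period Tbit of 25 ps.

```python
def crossing_time(times, values, threshold=0.0, rising=True):
    """Return the first time `values` crosses `threshold` (linear interpolation)."""
    for i in range(1, len(times)):
        v0, v1 = values[i - 1], values[i]
        crossed = (v0 < threshold <= v1) if rising else (v0 > threshold >= v1)
        if crossed:
            frac = (threshold - v0) / (v1 - v0)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    return None  # no crossing found


T_BIT = 1.0 / 40e9  # bit period at 40 Gb/s: 25 ps

# Hypothetical differential waveforms (seconds, volts), for illustration only.
t = [0e-12, 5e-12, 10e-12, 15e-12, 20e-12]
clk = [-0.2, -0.1, 0.1, 0.2, 0.2]   # crosses 0 V at 7.5 ps
vo2 = [-0.3, -0.3, -0.3, 0.1, 0.3]  # crosses 0 V at 13.75 ps

# Clock-to-output delay: output crossing minus clock crossing.
tcq2 = crossing_time(t, vo2) - crossing_time(t, clk)
print(f"Tbit = {T_BIT * 1e12:.1f} ps, tcq2 = {tcq2 * 1e12:.2f} ps")
```

The same helper could be applied to the DATA and CLK waveforms to obtain tdq.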
The PVAD flow requires specifications that capture the overall performance of the circuit: tests of the high-speed performance, and a test of the transit time of the transistors to avoid undersizing when optimizing the circuit.
Create an initial design that meets all specifications across the VT corners. The VT corners range from 27 °C to 105 °C and allow ±10% variation on the 2.5 V supply voltage.
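The VT corner set can be written out explicitly. This is a sketch under an assumption: the slides give only the temperature range and the supply tolerance, so the choice to simulate the two temperature end points and the minimum, nominal and maximum supply is illustrative.

```python
from itertools import product

NOMINAL_SUPPLY = 2.5        # volts, from the slides
SUPPLY_TOLERANCE = 0.10     # ±10 %, from the slides
TEMPERATURES_C = [27, 105]  # end points of the temperature range

# Assumed supply points: minimum, nominal and maximum.
supplies = [round(NOMINAL_SUPPLY * (1 + s * SUPPLY_TOLERANCE), 3) for s in (-1, 0, 1)]

# Every combination of temperature and supply voltage is one VT corner.
vt_corners = [
    {"temp_C": temp, "vdd": vdd} for temp, vdd in product(TEMPERATURES_C, supplies)
]

for corner in vt_corners:
    print(corner)
```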
Creating a parasitic estimated view
Parasitic resistances and capacitances are added based on experience and built into an estimated view.
Layout EAD can greatly assist in estimating the parasitics by performing an extraction on a preliminary layout.
Spec Comparison provides a clear overview of the impact of parasitics on the specifications. (Figure: schematic vs. estimated view.)
Corner extraction
Extract 3-sigma corners from the specifications for the worst-case VT corner. "Create Statistical Corners" automatically generates corners from the plotted specifications. K-Sigma Corners: ~200 samples are required for a 3-sigma extraction.
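The idea behind a K-sigma corner can be illustrated with a small sketch: estimate the 3-sigma value of a specification from the mean and standard deviation of about 200 Monte Carlo samples, then pick the sample closest to that target as the corner. This is an illustration only and not necessarily the algorithm used inside ADE Assembler; the function name and the synthetic data are hypothetical.

```python
import random
import statistics


def k_sigma_corner(samples, k=3.0, worse_is_higher=True):
    """Return (target, index): target = mean ± k*std, index = closest sample."""
    mean = statistics.mean(samples)
    std = statistics.stdev(samples)
    target = mean + k * std if worse_is_higher else mean - k * std
    index = min(range(len(samples)), key=lambda i: abs(samples[i] - target))
    return target, index


random.seed(0)
# Synthetic stand-in for 200 Monte Carlo results of a delay spec (picoseconds).
tcq2_samples = [random.gauss(10.0, 0.5) for _ in range(200)]

target, idx = k_sigma_corner(tcq2_samples, k=3.0)
print(f"3-sigma target: {target:.2f} ps")
print(f"closest sample: #{idx} at {tcq2_samples[idx]:.2f} ps")
```

With only a few hundred samples, very few results fall near the 3-sigma tail, so the target is extrapolated from the mean and standard deviation rather than read directly from the tail.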
Be aware that the corner extraction can still be inaccurate if the PDF of the output deviates from a normal distribution. (Figure panels: tdq, tcq2, vo2_amplitude.)
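One simple way to spot such a deviation is to look at the skewness of the Monte Carlo results. The sketch below uses synthetic data and a hypothetical helper; it is a quick sanity check under these assumptions, not a formal normality test and not an ADE feature.

```python
import random
import statistics


def sample_skewness(samples):
    """Biased sample skewness; values near 0 suggest a symmetric distribution."""
    mean = statistics.mean(samples)
    std = statistics.pstdev(samples)
    return sum(((x - mean) / std) ** 3 for x in samples) / len(samples)


random.seed(1)
# Synthetic examples: one Gaussian-like spec and one right-skewed spec.
gaussian_like = [random.gauss(0.0, 1.0) for _ in range(200)]
right_skewed = [random.lognormvariate(0.0, 0.5) for _ in range(200)]

print(f"skewness, Gaussian-like spec: {sample_skewness(gaussian_like):+.2f}")
print(f"skewness, right-skewed spec:  {sample_skewness(right_skewed):+.2f}")
```

A clearly non-zero skewness indicates that a mean ± 3 sigma extrapolation may misplace the corner for that specification.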
The generated 3-sigma corners display the dependence of each specification on the statistical parameters. (Figure annotations for tcq2, tdq and vo2_amplitude: "depends on statistical variations of the resistors" and "depends on statistical variations of the resistors and transistors".)
Why user-defined process corners are not preferred for custom analog IC design: the corner files relate to physical components or process steps, and their combinations total 96 PVT corners, from which 2 worst-case corners are derived by simulation.
The user-defined corners clearly overestimate the 3-sigma performance compared to the extracted corners. (Figure: user-defined corners vs. 3-sigma corners, annotated +35%.)
Size over corners
The first step is to parameterize the circuit elements and match them if required: resistors, transistors and current sources.
Provide a parameter range to be used by the optimizer or for manual sweeps. (Figure: initial parameters and optimizer parameter ranges.)
Optimization over multiple corners is most efficiently executed by the Size Over Corners tool in ADE Assembler. It uses a global optimizer by default, which can be followed by a final local optimization for open-ended specifications. A primary condition for this optimizer is that all specs are met for the nominal case. It saves simulation time by optimizing over the identified worst-case corners instead of over all corners in each sizing iteration. It can start from a setup state or let the optimizer do an initial sizing run.
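The time saving from working on worst-case corners can be sketched as follows: find, for each specification, the corner where it performs worst, then evaluate new sizing candidates on that reduced set only. The corner names, the numbers and the direction in which each spec gets worse are hypothetical, and this is not the tool's internal algorithm.

```python
# Hypothetical simulation results per corner (illustrative numbers only).
results = {
    "C0": {"tdq": 4.1, "tcq2": 9.8, "vo2_amplitude": 0.31},
    "C1": {"tdq": 4.6, "tcq2": 10.9, "vo2_amplitude": 0.29},
    "C2": {"tdq": 4.3, "tcq2": 10.2, "vo2_amplitude": 0.26},
}

# Whether a larger value is worse for each spec (an assumption of this sketch).
HIGHER_IS_WORSE = {"tdq": True, "tcq2": True, "vo2_amplitude": False}


def worst_case_corners(results, higher_is_worse):
    """Map each spec to the name of the corner where it performs worst."""
    worst = {}
    for spec, higher_bad in higher_is_worse.items():
        pick = max if higher_bad else min
        worst[spec] = pick(results, key=lambda corner: results[corner][spec])
    return worst


worst = worst_case_corners(results, HIGHER_IS_WORSE)
# Only these corners need to be simulated for each new sizing candidate.
corners_to_simulate = sorted(set(worst.values()))
print(worst)
print(corners_to_simulate)
```

In this example three corners reduce to two per sizing iteration; the saving grows with the size of the full corner set.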
Assist the optimizer by assigning different weights to the important specifications. tcq2 and vo2_amplitude are most important for the functionality, so they receive a bigger weighting factor than tdq. (Figure: results at the 3-sigma corners.)
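The effect of weighting can be illustrated with a simple cost function: each violated specification contributes its normalized violation multiplied by its weight. This is a sketch with hypothetical limits and weights; only the relative weighting (tcq2 and vo2_amplitude above tdq) follows the slide, and the actual cost function of the ADE optimizer is not described in the source.

```python
def weighted_cost(measured, specs, weights):
    """Sum of weighted, normalized spec violations; 0.0 means all specs are met."""
    cost = 0.0
    for name, (kind, limit) in specs.items():
        value = measured[name]
        # "max" specs are violated above the limit, "min" specs below it.
        violation = value - limit if kind == "max" else limit - value
        cost += weights[name] * max(0.0, violation) / abs(limit)
    return cost


# Hypothetical limits (ps, ps, V) and weights, for illustration only.
specs = {"tdq": ("max", 5.0), "tcq2": ("max", 10.0), "vo2_amplitude": ("min", 0.30)}
weights = {"tdq": 1.0, "tcq2": 3.0, "vo2_amplitude": 3.0}

measured = {"tdq": 5.5, "tcq2": 11.0, "vo2_amplitude": 0.30}
cost = weighted_cost(measured, specs, weights)
print(f"cost = {cost:.2f}")
```

Both tdq and tcq2 miss their limit by 10% here, but the tcq2 violation contributes three times as much to the cost, which steers the optimizer toward fixing it first.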
Optimization can take a long time and is tedious to set up for convergence in the desired direction. A final manual tuning step can be necessary. (Figure: initial parameters vs. final parameters.) The optimized circuit does not deviate much from the initial design, indicating that a local or manual optimization could have led to a solution faster.
Some specifications need to be relaxed in terms of yield in order to trade off high-speed against low-power operation. (Figure: results at the 3-sigma corners.)
Fast corner verification
A strong interaction between statistical parameters and design parameters requires re-extracting the 3-sigma corners. (Figure: new 3-sigma corners.) Since the optimized design did not deviate much from the initial design, the re-extracted 3-sigma corners correspond well with the original corners. The design is ready for layout.
Layout and extraction
The layout is extracted using QRC with maximum RC parasitics.
The extracted layout satisfies the specifications at the 3-sigma corners and is less impacted by layout parasitics than predicted by the estimated view. (Figure: 3-sigma corner results, estimated view vs. extracted layout.)
Yield verification
Yield verification through Sample Reordering reduces the number of required simulations. According to the model, a 3-sigma yield for vo2_amplitude is not achieved. This will be checked by running a Monte Carlo analysis with a fixed number of samples.
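For reference, a sigma level and a yield figure can be converted into each other when the specification is assumed to follow a normal distribution. The sketch below uses the Python standard library; the function names are illustrative, and the normality assumption is the same one that limits the corner extraction.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def yield_from_sigma(k, two_sided=False):
    """Yield of a Gaussian spec whose limit(s) sit k sigma from the mean."""
    one_sided = STD_NORMAL.cdf(k)
    return 2.0 * one_sided - 1.0 if two_sided else one_sided


def sigma_from_yield(target_yield):
    """One-sided sigma level equivalent to a target yield."""
    return STD_NORMAL.inv_cdf(target_yield)


print(f"3 sigma, one-sided limit: {yield_from_sigma(3.0):.3%}")
print(f"3 sigma, two-sided limits: {yield_from_sigma(3.0, two_sided=True):.3%}")
print(f"99.5% yield, one-sided limit: {sigma_from_yield(0.995):.2f} sigma")
```

A one-sided 3-sigma limit corresponds to a yield of about 99.87%, so a target such as 99.5% is a relaxation to roughly 2.6 sigma.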
A Monte Carlo analysis with 1400 samples is run on the worst-case VT corner to determine the final yield. The worst-case sample for vo2_amplitude matches the Sample Reordering result very well, and Sample Reordering required only 111 runs instead of 1400 to verify the yield. For this example, a 99.5% yield is sufficient for sign-off.
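A finite Monte Carlo run only estimates the yield, so it helps to attach a confidence bound to the observed pass fraction. The sketch below computes the lower end of a Wilson score interval for 1400 samples. The failure counts are hypothetical, since the slides do not state how many samples failed, and the 95% confidence level is an assumption of this example.

```python
import math


def wilson_lower_bound(passes, total, z=1.96):
    """Lower end of the Wilson score interval for a pass fraction (z=1.96 ~ 95 %)."""
    p_hat = passes / total
    centre = p_hat + z * z / (2 * total)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / total + z * z / (4 * total * total))
    return (centre - margin) / (1 + z * z / total)


TOTAL = 1400  # number of Monte Carlo samples, from the slides
for failures in (0, 2, 7):  # hypothetical failure counts
    lower = wilson_lower_bound(TOTAL - failures, TOTAL)
    observed = 1 - failures / TOTAL
    print(f"{failures} failures: observed {observed:.3%}, lower bound {lower:.3%}")
```

Comparing the lower bound, rather than the observed fraction, against the 99.5% target gives a more conservative sign-off criterion.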
Conclusion
By following a parasitic- and variation-aware design flow, the number of design iterations can be significantly reduced. Assisted by the Virtuoso ADE suite, the PVAD flow results in a yield-verified, optimized design while restricting simulation resources. The effectiveness of the optimization is shown by a 17% reduction in power consumption if the circuit were implemented in the PAM-4 driver IC from the example. Many ADE tools not covered here further contribute to a comprehensive verification of the design.
Wouter Soenen, PhD student, Department of Information Technology. E: wouter.soenen@ugent.be, T: +32 9 264 33 28. Ghent University, @ugent, www.ugent.be