The LEON-2 Fault-Tolerant Processor in 0.18 µm Commercial UMC Technology

The LEON-2 Fault-Tolerant Processor in 0.18 µm Commercial UMC Technology
Microelectronics Presentation Days, ESTEC, 4-5 February 2004

Roland Weigand
European Space Agency, Data Systems Division TOS-EDM, Microelectronics Section
Tel. +31-71-565-3298, Fax +31-71-565-6791, Roland.Weigand@esa.int

Steven Redant
IMEC, Leuven, Belgium
Tel. +32-16-281928, Fax +32-16-281584, redant@imec.be

- Introduction and Objectives
  - Objectives
  - Project Timeline
  - Design Presentation
- SEU Fault-Tolerance (FT) by Design Overview
  - EDACs and parity protection of memories
  - TMR implementation in VHDL and netlist
  - Clock and reset triplication and clock edge spreading
- Impacts of FT to the Design Flow
  - Simulation and synthesis of a design with TMR inserted partially in VHDL or at netlist
  - Initialisation of gate level simulations
  - Timing issues due to the clock edge spreading
  - Scan insertion and scan testing of a multiple clock and clock-edges design
- Packaging issues
  - Bonding of a very small die
- Conclusion

Objectives
- Prove the efficiency of the SEU protection in LEON2
  - LEON1 demonstrator in 0.35 µm Atmel technology (<= May 2001)
  - Now added clock edge spreading and mapped to 0.18 µm UMC technology
- Measure radiation behaviour of commercial library
  - Comparison to the Rad-Hard-By-Design (RHBD) library (same technology)
  - Total dose and latchup behaviour to be analysed
- Provide early prototypes of the AT697 chip
  - Almost compatible (LEON2-1.0.4 + InSilicon PCI interface)
  - 100 MHz clock frequency: significant performance gain compared to FPGA
  - Development board has been designed and is available
- Lessons learned: an important outcome of the project
  - Experience with LEON and PCI interface in general
  - Lessons for FT implementation
  - Interfacing to the ASIC/MPW flow of IMEC/Europractice
  - Important know-how transfer to other projects

Project Schedule (1)
- Q1/2002: Initiation, design definition and VHDL design
  - Green light (budget) for activity given 6 March, proposal from IMEC
- Q2/2002: Detailed design, contract placement (13 June)
  - Selection and generation of macrocells (PLLs, memories)
  - First netlist end of May, final pre-layout netlist and floorplan 9 August in several iterations; numerous issues had to be solved:
    - Netlist compatibility issues (Verilog naming rules in Synplify ASIC)
    - Change of source code (insertion of test pins, bugfixes of LEON 22 July)
    - Preparation of scan insertion interactively between IMEC and ESTEC
- Q3/2002: Gate level simulation, layout, scan insertion and pattern generation
  - Severe initialisation problems at gate level simulation -> forcing flip-flops necessary
  - Final pre-layout netlist with scan path ready 19 August
  - Place & route, clock tree synthesis, layout checks; several iterations performed by IMEC
  - Hold time fix and two reoptimisations facing Synopsys bugs
  - Final layout 19 September, tapeout released 9 October
  - Difficult package selection (small chip, high I/O count, suitable for radiation tests)
  - Scan pattern generation completed 18 October (after tapeout)

Project Schedule (2)
- Q4/2002: Manufacturing and test preparation
  - Test specification established together with IMEC
  - Test board preparation by IMEC and Microtest (Italy)
  - Functional test pattern generation and conversion
  - Samples delivered beginning of December, with package discussions still ongoing
- Q1/2003: Bonding and packaging
  - Problems bonding small die in large cavity: demetallisation
  - HCM (France) failed bonding, switch to AMIS (Pocatello, USA)
- Q2/2003: Testing; Q3/2003: Final report and close of project
  - Testing at Microtest (Lucca, Italy)
  - Scan test abandoned: hold problems identified in scan path
  - Functional tests affected by flip-flop forcing in gate simulation
  - Pattern mismatch due to (slow) pull-ups in the test board
  - A yield of 80% (40/50) was obtained on functional patterns at 10 MHz
  - Additional clock speed characterisation confirms the 100 MHz target

The Design
- Chip design data
  - LEON2FT-1.0.4 with Meiko FPU
  - InSilicon PCI core with ESA wrapper
  - UMC 0.18 µm libraries from Virtual Silicon (VST): core, pad, memory, PLL
  - 14 memories: 4 cache/tags, 2 two-port register files, 8 DSU memories 128x32
  - 2 PLLs: 33/33 MHz PCI, 25/100 MHz LEON
  - 256 pads (including 68 power pads)
  - 2x3 (TMR with edge spreading) clock domains (33 MHz PCI, 100 MHz LEON)
- Area
  - On-chip memory: 2.18 mm² = 200 kbit
  - Standard cells: 3.55 mm² = 290 kgates
    = 170 kgates flip-flops + 120 kgates combinatorial
    = 100 kgates for PCI + 190 kgates for LEON
  - Core area: 2.68 x 2.68 mm = 7.18 mm²
  - Chip size: 4.3 x 4.3 mm = 18.5 mm²
  - High share of flip-flops (PCI FIFOs!)
  - The chip is pad-limited!

Layout and Chip Photo
[Figure: chip layout and die photo; die edge length 4.3 mm.]

SEU and SET Fault Tolerance
- EDAC and parity protection protect against Single Event Upsets (SEU)
  - Used for internal and external memories; impact on processor control
- Triple Modular Redundancy (TMR) flip-flops
  - Triplication and voting of every flip-flop in the design mitigates SEU (1)
  - Increasing importance of Single Event Transients (SET) in combinatorial logic
  - SET protection in voter logic by shifted feedback (2) (not implemented in LEON-FT)
  - SET protection in clocks (and asynchronous resets) by triplication
  - LEON-FT: SET protection in all combinatorial logic by skewing the clock edges
    - Delay δ between the clock trees is technology dependent (SET pulse length)
    - Increases minimum clock period by 2δ
    - Risk of hold time problems
    - In : 0.5 ns fixed δ ~ 10% of the clock target (10 ns = 100 MHz)
    - In ATC697 use programmable δ???

TMR Flip-Flop with enable (1)
[Figure: single-clock TMR with voted feedback. Labels: D, en, MUX, D1, D2, D3, FF1, FF2, FF3, clk, Q1, Q2, Q3, Voter, Q.]
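As a rough behavioural illustration of the voted-feedback scheme in this figure, the following Python sketch models a TMR flip-flop with enable. The class name `TmrFlipFlop` and its methods are hypothetical and not taken from the LEON sources; the single shared clock is modelled as one `clock()` call that updates all three copies.

```python
def majority(a: int, b: int, c: int) -> int:
    """2-of-3 majority vote on single bits."""
    return (a & b) | (a & c) | (b & c)


class TmrFlipFlop:
    """Behavioural model (an assumption, not LEON code) of a TMR
    flip-flop with enable, single clock and voted feedback."""

    def __init__(self, init: int = 0) -> None:
        self.r = [init, init, init]  # the three redundant storage bits

    @property
    def q(self) -> int:
        return majority(*self.r)

    def clock(self, d: int, en: int) -> None:
        # With enable low, every copy reloads the voted output, so a
        # single upset copy is overwritten at the next clock edge.
        nxt = d if en else self.q
        self.r = [nxt, nxt, nxt]


ff = TmrFlipFlop()
ff.clock(d=1, en=1)   # store a 1
ff.r[1] ^= 1          # inject a single event upset into one copy
print(ff.q, ff.r)     # the voter masks the upset copy
ff.clock(d=0, en=0)   # hold cycle: the voted value is fed back
print(ff.q, ff.r)     # all three copies agree again
```

The point of the sketch is the hold cycle: without voted feedback an upset copy would keep its wrong value until the next enabled load, leaving the register exposed to a second upset.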

TMR Flip-Flop with enable (2)
[Figure: TMR with shifted feedback, which protects against SET in the voter; US-Pat. 6637005 (Hughes). Labels: D, en, three MUXes, D1, D2, FF1, FF2, FF3, clk, Q1, Q2, Q3, Voter, Q.]

TMR Flip-Flop with enable (3)
[Figure: TMR with triplicated clock tree and skewed clocks protecting against SET; δ ~ SET pulse length. Labels: D, en, MUX, D1, D2, D3, FF1, FF2, FF3, clk, δ, δ, clock tree 1, clock tree 2, clock tree 3, Q1, Q2, Q3, Voter, Q.]
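The reasoning behind "δ ~ SET pulse length" can be checked with a small timing model. The Python sketch below is a simplification under stated assumptions: the transient appears on the shared data input, the three flip-flops sample at ideal instants spaced δ apart, and setup and hold windows are ignored. The function name and the pulse widths are illustrative; only the 0.5 ns value of δ comes from the slides.

```python
def copies_hit(pulse_start_ps: int, pulse_width_ps: int,
               clk_edge_ps: int, delta_ps: int) -> int:
    """Count how many of the three skewed flip-flops sample the transient.

    The transient occupies [pulse_start, pulse_start + pulse_width) and the
    flip-flops sample at clk_edge, clk_edge + delta and clk_edge + 2*delta.
    """
    pulse_end_ps = pulse_start_ps + pulse_width_ps
    edges = [clk_edge_ps + k * delta_ps for k in range(3)]
    return sum(pulse_start_ps <= t < pulse_end_ps for t in edges)


DELTA_PS = 500    # 0.5 ns skew between clock trees, as quoted in the slides
EDGE_PS = 10_000  # arbitrary position of the first clock edge (assumption)

# Sweep the transient across the three clock edges.
starts = range(9_000, 12_000, 10)
worst_short = max(copies_hit(s, 500, EDGE_PS, DELTA_PS) for s in starts)
worst_long = max(copies_hit(s, 600, EDGE_PS, DELTA_PS) for s in starts)
print(worst_short, worst_long)
```

In this model a transient no longer than δ corrupts at most one copy, which the voter masks; a longer transient can reach two copies and defeat the vote.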

TMR Flip-Flop Insertion
- Native in VHDL-RTL source code
  - TMR can be instantiated or inferred
  - Mixed TMR and non-TMR RTL code requires resolution function for clocks

    library ieee;
    use ieee.std_logic_1164.all;

    entity DFF1_TMR is
      port (
        clk : in  std_logic_vector(2 downto 0);  -- triplicated clock
        d   : in  std_logic;
        q   : out std_logic
      );
    end;

    -- Architecture wrapper and declaration of r are assumed;
    -- the slide shows only the entity, the processes and the voter.
    architecture rtl of DFF1_TMR is
      signal r : std_logic_vector(2 downto 0);
    begin
      -- One process per TMR flip-flop
      rx0 : process(clk) begin
        if rising_edge(clk(0)) then r(0) <= d; end if;
      end process;
      rx1 : process(clk) begin
        if rising_edge(clk(1)) then r(1) <= d; end if;
      end process;
      rx2 : process(clk) begin
        if rising_edge(clk(2)) then r(2) <= d; end if;
      end process;
      -- Voting outputs
      q <= (r(0) and r(1)) or (r(0) and r(2)) or (r(1) and r(2));
    end;

- At gate level
  - Preferred for third party IPs, facilitating maintenance of the source code
  - Library and synthesis tool dependent
  - Unique clock names in RTL source code
  - Synthesise netlist without TMR
  - Use package with equivalent TMR cells for all flip-flops used in the netlist
  - Edit netlist to triplicate clocks (including any clock buffers/inverters), instantiate TMR cells instead of library flip-flops
  - Carefully inspect edited netlist
  - Resynthesise the edited netlist

    sed -e 's/clk\(.*\) std_logic/clk\1 std_logic_vector(2 downto 0) /' \
        -e 's/bufx\(.*\)invdl/bufx\1invdl_tmr/' \
        -e 's/dff1 port map/dff1_tmr port map/' \
        -e 's/dff2 port map/dff2_tmr port map/' \
        netlist_notmr.vhd > netlist_tmr.vhd

Simulation and Synthesis
- Mixed RTL simulation of native TMR and non-TMR design
  - Definition of two clock types (triple and single), connection by a resolution function
- 1st synthesis of non-TMR block and script-based TMR insertion to netlist
  - Overconstraining required to allow insertion of TMR voters
  - Inspection and resimulation of TMR inserted netlist in native TMR RTL code
- Resynthesis of TMR inserted netlist in TMR RTL code
  - Retiming of TMR inserted netlist difficult -> better use margin at 1st synthesis
  - Conserve TMR in netlist, yet prune unused logic
  - Conserve triplicated clock nets and define three (virtual) clocks
  - Instantiated TMR flip-flops -> several thousands of design units -> critical (in Synopsys)
    - Selective flattening after elaboration: only flip-flops, not the design hierarchy
    - No relation between signal names and flip-flop instance names
- Post-layout timing analysis and re-optimisation
  - Three clock trees per domain with clock delay cells instantiated
  - Carefully model the clock scenario -> propagate clocks
  - Mandatory hold time fix after each re-optimisation step
  - Include scan path routing in the hold time fix

Gate Level Simulation
- Initialisation of the LEON model with the UMC/VST libraries
  - Processor turns X a few cycles after reset in timed or un-timed gate simulation
  - Not FT-related, reported by a university project using the same libraries
  - Library-related problem, never occurred on Atmel libraries
  - Investigation of modelling in VST Verilog models did not show apparent bugs
  - Problem remains unsolved; (dirty) workaround:
    - -> reset all flip-flops by simulator command before hardware reset
  - Leads to ambiguity in production test pattern
    - Some flip-flops are initialised in simulation, but not in reality; workaround:
    - -> run two simulations, one with reset, one with preset, mask all differences
- Initialisation of memories
  - General problem of all processor designs using on-chip memories
  - More critical with FT: EDAC affects processor control and facilitates X propagation
  - -> Initialise memory (Verilog) simulation models for netlist verification
  - -> Execute memory initialisation program for test pattern generation
- Asynchronous clock domains
  - RTL simulation with different clock frequencies (LEON 100 MHz, PCI 33 MHz)
  - Non-deterministic at gate/hardware level (cycle slips, timing violations in synchronisers)
  - -> Use equal (or integer multiple) clock frequencies for test pattern generation
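The "run two simulations, mask all differences" workaround can be sketched as a simple post-processing step on the expected-output vectors. The Python below is only an illustration: the vector format (one string of `0`/`1`/`X` characters per cycle) and the function name `mask_differences` are assumptions, not the format or tooling used in the project.

```python
from typing import List


def mask_differences(run_reset: List[str], run_preset: List[str]) -> List[str]:
    """Merge the expected outputs of two gate-level simulations.

    Any output bit that differs between the run with all flip-flops forced
    to 0 and the run with all flip-flops forced to 1 depends on the
    (unknown) power-up state, so it is replaced by a don't-care 'X'.
    """
    if len(run_reset) != len(run_preset):
        raise ValueError("simulation runs differ in number of cycles")
    masked = []
    for cycle, (a, b) in enumerate(zip(run_reset, run_preset)):
        if len(a) != len(b):
            raise ValueError(f"vector width differs in cycle {cycle}")
        masked.append("".join(x if x == y else "X" for x, y in zip(a, b)))
    return masked


# Hypothetical three-cycle excerpt of expected outputs from the two runs.
reset_run = ["1010", "0011", "1100"]
preset_run = ["1000", "0011", "1101"]
print(mask_differences(reset_run, preset_run))
```

Bits that agree in both runs are independent of the forced initial state and can safely be compared on the tester; the masked bits are not checked.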

TMR Timing Issues
[Figure: two TMR register stages (FF1, FF2, FF3, each stage followed by a voter) connected through combinatorial logic and clocked by clk1, clk2 and clk3, skewed by δ each. Annotated delays: t_prop, δ_voter, δ_logic, t_setup.]

Cycle time: $T \geq t_{\mathrm{prop}} + \delta_{\mathrm{logic}} + t_{\mathrm{setup}} + \delta_{\mathrm{voter}} + 2\delta$

TMR voters and clock skewing reduce operating frequency
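To get a feel for the cycle time bound, the short Python sketch below evaluates it for one set of numbers. Only δ = 0.5 ns and the 10 ns clock target come from the slides; the propagation, logic, setup and voter delays are made-up placeholder values, and the function names are hypothetical.

```python
def min_cycle_time_ps(t_prop_ps: int, d_logic_ps: int, t_setup_ps: int,
                      d_voter_ps: int, delta_ps: int) -> int:
    """Lower bound on the clock period: the worst path launches on the
    latest clock tree (+2*delta) and is captured on the earliest one."""
    return t_prop_ps + d_logic_ps + t_setup_ps + d_voter_ps + 2 * delta_ps


def max_frequency_mhz(period_ps: int) -> float:
    return 1e6 / period_ps


DELTA_PS = 500  # from the slides; all other delays below are assumptions
period = min_cycle_time_ps(t_prop_ps=400, d_logic_ps=7000, t_setup_ps=200,
                           d_voter_ps=300, delta_ps=DELTA_PS)
print(period, "ps minimum period")
print(round(max_frequency_mhz(period), 1), "MHz maximum frequency")

# Share of the 10 ns clock target consumed by the clock skew alone.
print(2 * DELTA_PS / 10_000)
```

With δ = 0.5 ns the skew term 2δ alone takes 1 ns, i.e. 10% of the 10 ns target period, before the voter delay is counted.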

Scan Path Insertion (wrong)
[Figure: scan chain routed FFA1 -> FFA2 -> FFA3 and FFB1 -> FFB2 -> FFB3, i.e. across the sub-clock domains clk1, clk2 and clk3 (skewed by δ each); the t_prop and t_setup annotations mark a hold violation.]

Scan path routing across sub-clock domains -> hold violations

Scan Path Insertion (right)
[Figure: three separate scan chains, FFA1 -> FFB1 on clk1, FFA2 -> FFB2 on clk2 and FFA3 -> FFB3 on clk3 (clocks skewed by δ each); t_prop and t_setup annotated.]

Better: one scan path per sub-clock domain; may also simplify pattern generation
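The difference between the two scan routings can be expressed as a hold check on each scan hop. The Python sketch below is a simplified model: clock trees are numbered 0 to 2 (standing for clk1 to clk3), tree k is assumed to arrive exactly k·δ after tree 0, and the clock-to-output delay and hold time are placeholder values rather than library data.

```python
def hold_slack_ps(launch_tree: int, capture_tree: int, delta_ps: int,
                  t_prop_ps: int, t_hold_ps: int) -> int:
    """Hold slack of one scan hop; negative means a hold violation.

    New data leaves the launching flip-flop t_prop after its clock edge and
    must not reach the capturing flip-flop before that flip-flop's clock
    edge plus its hold time.
    """
    launch_edge = launch_tree * delta_ps
    capture_edge = capture_tree * delta_ps
    return (launch_edge + t_prop_ps) - (capture_edge + t_hold_ps)


DELTA_PS = 500   # skew between clock trees, as quoted in the slides
T_PROP_PS = 300  # assumed clock-to-Q plus scan routing delay
T_HOLD_PS = 50   # assumed hold time

# "Wrong": chain hops from tree 0 to 1 to 2 (FFA1 -> FFA2 -> FFA3).
wrong_hops = [(0, 1), (1, 2)]
# "Right": each chain stays on its own tree (FFAk -> FFBk).
right_hops = [(0, 0), (1, 1), (2, 2)]

for name, hops in (("wrong", wrong_hops), ("right", right_hops)):
    slacks = [hold_slack_ps(l, c, DELTA_PS, T_PROP_PS, T_HOLD_PS)
              for l, c in hops]
    print(name, slacks)
```

Whenever a hop goes to a later clock tree, δ is subtracted from the slack, so the path needs at least δ of extra delay to be safe; hops within one sub-clock domain see no such penalty.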

Packaging and Bonding
- Ceramic package required for radiation tests: PGA-299
- Despite a small cavity: long bonding wires

Conclusion
- 1st silicon of the LEON2-FT in 0.18 µm UMC commercial technology
  - Functionally (but not pin) ~ compatible to AT697; prototype board available
  - 100 MHz clock frequency target confirmed in production test and validation board
  - Power consumption ~ 5 mW/MHz; power down mode inefficient (in this technology)
- Lessons learned
  - Critical issues of the LEON processor (reset behaviour)
  - TMR implementation in VHDL and netlist
  - Timing issues in a multiple (skewed) clock environment
  - Packaging and bonding of a very small die with high pin count
- Basic SEU tests done in Californium environment
  - All memory SEUs corrected, no SEU errors in flip-flops detected
  - More in-depth testing should be performed
  - See next presentation