Chapter 6
Pipelining
- Improve performance by increasing instruction throughput.
[Figure: three loads (lw $1, 100($0); lw $2, 200($0); lw $3, 300($0)) executed without pipelining, where each instruction takes 8 ns and starts only after the previous one finishes, and with pipelining, where a new instruction starts every 2 ns. Vertical axis: program execution order (in instructions). Horizontal axis: time in ns. Stages shown: instruction fetch, Reg, ALU, data access, Reg.]
- Ideal speedup is the number of stages in the pipeline. Do we achieve this?
Electrical & Computer Engineering, The College of New Jersey

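A rough check of the speedup question, using only the figure's numbers (8 ns per unpipelined load, 2 ns per pipeline stage) and assuming five stages and no stalls. The function names are illustrative, not from the slides.

```python
# Sketch: total run time for n instructions, with and without pipelining.
# Assumptions: 8 ns per unpipelined instruction, 2 ns per stage, 5 stages, no stalls.

def unpipelined_time_ns(n, instr_ns=8):
    # Each instruction runs to completion before the next one starts.
    return n * instr_ns

def pipelined_time_ns(n, stage_ns=2, stages=5):
    # The first instruction needs every stage; each later one finishes one stage time later.
    return (stages + n - 1) * stage_ns

for n in (3, 1000):
    slow = unpipelined_time_ns(n)
    fast = pipelined_time_ns(n)
    print(n, slow, fast, round(slow / fast, 2))
```

Under these assumptions the speedup tends toward 8/2 = 4 for long programs rather than 5, because the clock must fit the slowest stage and the first instruction still pays the full fill time.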
Pipelining
- What makes it easy?
  - all instructions are the same length
  - just a few instruction formats
  - memory operands appear only in loads and stores
- What makes it hard?
  - structural hazards: suppose we had only one memory
  - control hazards: need to worry about branch instructions
  - data hazards: an instruction depends on a previous instruction

Pipelining
- We'll build a simple pipeline and look at these issues.
- We'll talk about modern processors and what really makes it hard:
  - exception handling
  - trying to improve performance with out-of-order execution, etc.

Basic Idea
- IF: Instruction fetch
- ID: Instruction decode / register file read
- EX: Execute / address calculation
- MEM: Memory access
- WB: Write back
[Figure: single-cycle datapath (PC, instruction memory, registers, sign extend, ALU, data memory) divided into the five stages.]
- What do we need to add to actually split the datapath into stages?

Pipelined Datapath
[Figure: datapath with pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB between the stages.]
- Can you find a problem even if there are no dependencies?
- What instructions can we execute to manifest the problem?

Corrected Datapath
[Figure: corrected pipelined datapath with pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB.]

Graphically Representing Pipelines
[Figure: multiple-clock-cycle diagram, time in clock cycles CC 1 to CC 6, showing lw $10, 20($1) followed by sub $11, $2, $3, each passing through IM, Reg, ALU, DM, Reg.]
- Can help with answering questions like:
  - how many cycles does it take to execute this code?
  - what is the ALU doing during cycle 4?
- Use this representation to help understand datapaths.

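Both questions can be answered mechanically from the diagram. The sketch below assumes a five-stage pipeline with no stalls and one instruction issued per cycle; the stage labels and function names are illustrative.

```python
# Sketch: which stage each instruction occupies in a given clock cycle.
# Assumption: no stalls, one new instruction enters the pipeline every cycle.
STAGES = ["IM", "Reg", "ALU", "DM", "Reg"]  # labels as drawn in the diagram

def stage_in_cycle(instr_index, cycle):
    """Stage of instruction instr_index (0-based) in clock cycle `cycle` (1-based), or None."""
    pos = cycle - 1 - instr_index
    return STAGES[pos] if 0 <= pos < len(STAGES) else None

def total_cycles(n):
    # The last of n instructions enters in cycle n and needs len(STAGES) cycles.
    return len(STAGES) + n - 1

program = ["lw $10, 20($1)", "sub $11, $2, $3"]
print("cycles:", total_cycles(len(program)))
for i, text in enumerate(program):
    if stage_in_cycle(i, 4) == "ALU":
        print("ALU in cycle 4:", text)
```

For the two-instruction example this gives six cycles, and the ALU is working on the sub in cycle 4 while the lw is in its data memory stage.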
Pipeline Control
[Figure: pipelined datapath with the control signals identified: PCSrc, Branch, RegWrite, ALUSrc, ALUOp, RegDst, MemRead, MemWrite, MemtoReg. The ALU control takes Instruction [5-0] and ALUOp; RegDst selects between Instruction [20-16] and Instruction [15-11].]

Pipeline control
- We have 5 stages. What needs to be controlled in each stage?
  - Instruction Fetch and PC Increment
  - Instruction Decode / Register Fetch
  - Execution
  - Memory Stage
  - Write Back

Pipeline control
- How would control be handled in an automobile plant?
  - a fancy control center telling everyone what to do?
  - should we use a finite state machine?

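Pipeline Control
- Pass control signals along just like the data.

                Execution/Address Calculation     Memory access              Write-back
                stage control lines               stage control lines        stage control lines
  Instruction   RegDst  ALUOp1  ALUOp0  ALUSrc    Branch  MemRead  MemWrite  RegWrite  MemtoReg
  R-format      1       1       0       0         0       0        0         1         0
  lw            0       0       0       1         0       1        0         1         1
  sw            X       0       0       1         0       0        1         0         X
  beq           X       0       1       0         1       0        0         0         X

[Figure: the control unit generates the EX, MEM, and WB signal groups during ID; the groups travel through the ID/EX, EX/MEM, and MEM/WB pipeline registers, and each group is dropped once its stage has used it.]
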
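A minimal sketch of "passing control along": the signals are grouped by the stage that consumes them, and each pipeline register keeps only the groups still needed downstream. The signal values are assumed to be the usual ones for this instruction subset ("X" is a don't-care), and the `clock` helper is hypothetical.

```python
# Sketch: control signal groups travelling through the pipeline registers.
# Assumption: values below are the usual ones for this datapath; "X" = don't care.
CONTROL = {
    "R-format": {"EX": {"RegDst": 1, "ALUOp1": 1, "ALUOp0": 0, "ALUSrc": 0},
                 "MEM": {"Branch": 0, "MemRead": 0, "MemWrite": 0},
                 "WB": {"RegWrite": 1, "MemtoReg": 0}},
    "lw":       {"EX": {"RegDst": 0, "ALUOp1": 0, "ALUOp0": 0, "ALUSrc": 1},
                 "MEM": {"Branch": 0, "MemRead": 1, "MemWrite": 0},
                 "WB": {"RegWrite": 1, "MemtoReg": 1}},
    "sw":       {"EX": {"RegDst": "X", "ALUOp1": 0, "ALUOp0": 0, "ALUSrc": 1},
                 "MEM": {"Branch": 0, "MemRead": 0, "MemWrite": 1},
                 "WB": {"RegWrite": 0, "MemtoReg": "X"}},
    "beq":      {"EX": {"RegDst": "X", "ALUOp1": 0, "ALUOp0": 1, "ALUSrc": 0},
                 "MEM": {"Branch": 1, "MemRead": 0, "MemWrite": 0},
                 "WB": {"RegWrite": 0, "MemtoReg": "X"}},
}

def clock(regs, decoded):
    """Control fields of the pipeline registers after one clock edge.

    `decoded` is the instruction class leaving ID this cycle, or None for an empty slot.
    """
    return {
        "ID/EX": dict(CONTROL[decoded]) if decoded else {},
        # The EX group has been used and is not carried further.
        "EX/MEM": {k: v for k, v in regs["ID/EX"].items() if k in ("MEM", "WB")},
        # Only the WB group reaches the last pipeline register.
        "MEM/WB": {k: v for k, v in regs["EX/MEM"].items() if k == "WB"},
    }

regs = {"ID/EX": {}, "EX/MEM": {}, "MEM/WB": {}}
for decoded in ("lw", None, None):
    regs = clock(regs, decoded)
print(regs["MEM/WB"])
```

No central controller or finite state machine is needed here: each instruction carries its own settings with it, much like a work order travelling down an assembly line.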
Datapath with Control
[Figure: pipelined datapath with the control unit in ID and the EX, MEM, and WB control fields carried in the ID/EX, EX/MEM, and MEM/WB pipeline registers. Signals shown: PCSrc, Branch, RegWrite, ALUSrc, ALUOp, RegDst, MemRead, MemWrite, MemtoReg.]

Dependencies
- Problem with starting the next instruction before the first is finished:
  - dependencies that "go backward in time" are data hazards

    sub $2, $1, $3
    and $12, $2, $5
    or  $13, $6, $2
    add $14, $2, $2
    sw  $15, 100($2)

[Figure: multiple-clock-cycle diagram, CC 1 to CC 9, for the sequence above. Value of register $2: 10 in CC 1 to CC 4, 10/-20 in CC 5, -20 from CC 6 on.]

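The backward-pointing dependencies can be found by a simple scan. This sketch assumes a five-stage pipeline without forwarding, in which the register file is written in the first half of a cycle and read in the second half, so only the two instructions right after a producer can read a stale value. The tuple encoding of the instructions is illustrative.

```python
# Sketch: find data hazards in the slide's sequence.
# Each entry: (text, destination register or None, source registers).
program = [
    ("sub $2, $1, $3",   "$2",  ["$1", "$3"]),
    ("and $12, $2, $5",  "$12", ["$2", "$5"]),
    ("or $13, $6, $2",   "$13", ["$6", "$2"]),
    ("add $14, $2, $2",  "$14", ["$2", "$2"]),
    ("sw $15, 100($2)",  None,  ["$15", "$2"]),
]

def data_hazards(program, window=2):
    """Pairs (producer, consumer) where the consumer reads the register too early.

    Assumption: only consumers within `window` instructions of the producer are affected.
    """
    hazards = []
    for i, (text, dest, _) in enumerate(program):
        if dest is None:
            continue
        for j in range(i + 1, min(i + 1 + window, len(program))):
            if dest in program[j][2]:
                hazards.append((text, program[j][0]))
    return hazards

for producer, consumer in data_hazards(program):
    print(producer, "->", consumer)
```

Under these assumptions only the and and the or read $2 too early; the add reads in the same cycle the sub writes, and the sw reads one cycle later.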
Software Solution
- Have the compiler guarantee no hazards.
- Where do we insert the "nops"?

    sub $2, $1, $3
    and $12, $2, $5
    or  $13, $6, $2
    add $14, $2, $2
    sw  $15, 100($2)

- Problem: this really slows us down!

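One way a compiler pass might place the nops, sketched under the assumption that a consumer is safe once it sits at least three instruction slots after its producer (register file written in the first half of a cycle, read in the second half, no forwarding). The helper name and tuple encoding are illustrative.

```python
# Sketch: insert nops so that no instruction reads a register too early.
# Each entry: (text, destination register or None, source registers).
NOP = ("nop", None, [])

program = [
    ("sub $2, $1, $3",   "$2",  ["$1", "$3"]),
    ("and $12, $2, $5",  "$12", ["$2", "$5"]),
    ("or $13, $6, $2",   "$13", ["$6", "$2"]),
    ("add $14, $2, $2",  "$14", ["$2", "$2"]),
    ("sw $15, 100($2)",  None,  ["$15", "$2"]),
]

def insert_nops(program, gap=3):
    """Return a new instruction list with nops added.

    Assumption: a consumer must be at least `gap` slots after its producer.
    """
    out = []
    for text, dest, srcs in program:
        needed = 0
        # Look at the last gap-1 emitted instructions, nearest first.
        for back, (_, d, _) in enumerate(reversed(out[-(gap - 1):]), start=1):
            if d is not None and d in srcs:
                needed = max(needed, gap - back)
        out.extend([NOP] * needed)
        out.append((text, dest, srcs))
    return out

for text, _, _ in insert_nops(program):
    print(text)
```

For this sequence the sketch places two nops between the sub and the and, which is exactly the cost the slide complains about: two wasted cycles for one dependence.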
Forwarding
- Use temporary results, don't wait for them to be written:
  - register file forwarding to handle read/write to the same register
  - ALU forwarding

    sub $2, $1, $3
    and $12, $2, $5
    or  $13, $6, $2
    add $14, $2, $2
    sw  $15, 100($2)

[Figure: multiple-clock-cycle diagram, CC 1 to CC 9, for the sequence above. Value of register $2: 10 in CC 1 to CC 4, 10/-20 in CC 5, -20 from CC 6 on. Value of EX/MEM: -20 in CC 4, X otherwise. Value of MEM/WB: -20 in CC 5, X otherwise.]
- What if the $2 in the sw were $13?

Forwarding
[Figure: pipelined datapath with a forwarding unit. The unit compares the source register numbers carried from IF/ID (IF/ID.RegisterRs, IF/ID.RegisterRt) with EX/MEM.RegisterRd and MEM/WB.RegisterRd and drives the multiplexors on the two ALU inputs. IF/ID.RegisterRt and IF/ID.RegisterRd also feed the destination register multiplexor.]

Can't always forward
- Load word can still cause a hazard:
  - an instruction tries to read a register following a load instruction that writes to the same register

    lw  $2, 20($1)
    and $4, $2, $5
    or  $8, $2, $6
    add $9, $4, $2
    slt $1, $6, $7

[Figure: multiple-clock-cycle diagram, CC 1 to CC 9, for the sequence above; the loaded value is available only after the lw's data memory stage, too late for the ALU stage of the and.]
- Thus, we need a hazard detection unit to "stall" the instruction that follows the load.

Stalling
- We can stall the pipeline by keeping an instruction in the same stage.

    lw  $2, 20($1)
    and $4, $2, $5
    or  $8, $2, $6
    add $9, $4, $2
    slt $1, $6, $7

[Figure: multiple-clock-cycle diagram, CC 1 to CC 10, for the sequence above; a bubble is inserted after the lw, so the and repeats its decode stage and the or repeats its fetch stage.]

Hazard Detection Unit
- Stall by letting an instruction that won't write anything go forward.
[Figure: pipelined datapath with forwarding unit and hazard detection unit. Hazard detection unit inputs: ID/EX.MemRead, ID/EX.RegisterRt, IF/ID.RegisterRs, IF/ID.RegisterRt. Outputs: PCWrite, IF/IDWrite, and the select of a multiplexor that zeroes the control signals entering ID/EX.]

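The stall condition can be stated in a few lines. This sketch follows the usual load-use test (the instruction in EX is a load and its destination matches a source of the instruction in ID); the function name and the returned field names are illustrative.

```python
# Sketch: load-use hazard detection.
# Assumption: the load's destination register number is carried as ID/EX.RegisterRt.

def hazard_detection(id_ex_memread, id_ex_rt, if_id_rs, if_id_rt):
    """Return the unit's outputs for the current cycle."""
    stall = bool(id_ex_memread) and id_ex_rt in (if_id_rs, if_id_rt)
    return {
        "PCWrite": not stall,      # hold the PC so the same instruction is fetched again
        "IF/IDWrite": not stall,   # hold IF/ID so the same instruction is decoded again
        "insert_bubble": stall,    # zero the control signals entering ID/EX
    }

# lw $2, 20($1) in EX, and $4, $2, $5 in ID (rs = 2, rt = 5):
print(hazard_detection(True, 2, 2, 5))
# One cycle later the lw has moved on and no load is in EX:
print(hazard_detection(False, 0, 2, 5))
```

Zeroing the control signals is what turns the stalled slot into an instruction that writes nothing, which is the bubble shown in the stalling diagram.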
Branch Hazards
- When we decide to branch, other instructions are in the pipeline!

    40 beq $1, $3, 7
    44 and $12, $2, $5
    48 or  $13, $6, $2
    52 add $14, $2, $2
    72 lw  $4, 50($7)

[Figure: multiple-clock-cycle diagram, CC 1 to CC 9, for the sequence above; the instructions at 44, 48, and 52 have entered the pipeline before the branch outcome is known.]
- We are predicting "branch not taken":
  - need to add hardware for flushing instructions if we are wrong

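The addresses in the example can be checked with a little arithmetic. The sketch assumes the beq offset counts words relative to the instruction after the branch, and that the branch is resolved three stages after fetch, so three sequential instructions are already in flight. Function names are illustrative.

```python
# Sketch: branch target and the instructions to flush on a taken branch.
# Assumptions: offset is in words relative to PC + 4; 4-byte instructions.

def branch_target(pc, offset_words):
    return pc + 4 + (offset_words << 2)

def flushed_addresses(pc, resolve_delay=3):
    """Addresses fetched after the branch before its outcome is known."""
    return [pc + 4 * k for k in range(1, resolve_delay + 1)]

print(branch_target(40, 7))      # target of beq $1, $3, 7 at address 40
print(flushed_addresses(40))     # instructions to discard if the branch is taken
print(flushed_addresses(40, 1))  # if the decision is moved up to the ID stage
```

With the decision made late, a taken branch costs the three instructions at 44, 48, and 52; resolving the branch earlier shrinks the list of instructions that must be flushed.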
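Flushing Instructions
[Figure: pipelined datapath with the branch decision moved into the ID stage: an equality comparator (=) on the register file outputs, the branch target adder (sign extend, shift left 2) in ID, and an IF.Flush signal that clears the instruction in IF/ID. The hazard detection unit and forwarding unit are also shown.]
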
Improving Performance
- Try to avoid stalls! E.g., reorder these instructions:

    lw $t0, 0($t1)
    lw $t2, 4($t1)
    sw $t2, 0($t1)
    sw $t0, 4($t1)

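A quick way to compare orderings is to count load-use pairs, i.e. a lw immediately followed by an instruction that reads the loaded register. The sketch assumes forwarding is in place, so this is the only case that stalls, and it ignores any shift in adjacency caused by the stall itself. The tuple encoding is illustrative.

```python
# Sketch: count load-use stalls before and after reordering.
# Each entry: (opcode, destination register or None, source registers).

def load_use_stalls(program):
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        op, dest, _ = prev
        if op == "lw" and dest in cur[2]:
            stalls += 1
    return stalls

original = [
    ("lw", "$t0", ["$t1"]),          # lw $t0, 0($t1)
    ("lw", "$t2", ["$t1"]),          # lw $t2, 4($t1)
    ("sw", None, ["$t2", "$t1"]),    # sw $t2, 0($t1)
    ("sw", None, ["$t0", "$t1"]),    # sw $t0, 4($t1)
]
# Swap the two stores; they write different addresses, so the result is unchanged.
reordered = [original[0], original[1], original[3], original[2]]

print(load_use_stalls(original), load_use_stalls(reordered))
```

Swapping the stores puts one independent instruction between the second lw and the sw that needs its value, which is enough to remove the stall in this sketch.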
Improving Performance
- Add a "branch delay slot":
  - the next instruction after a branch is always executed
  - rely on the compiler to "fill" the slot with something useful
- Superscalar: start more than one instruction in the same cycle.

Dynamic Scheduling
- The hardware performs the "scheduling":
  - hardware tries to find instructions to execute
  - out-of-order execution is possible
  - speculative execution and dynamic branch prediction

Dynamic Scheduling
- All modern processors are very complicated:
  - DEC Alpha 21264: 9-stage pipeline, 6-instruction issue
  - PowerPC and Pentium: branch history table
  - Compiler technology is important

Dynamic Scheduling
- This class has given you the background you need to learn more.
- Video: An Overview of Intel's Pentium Processor