Pipeline design. Mehran Rezaei


1 Pipeline design Mehran Rezaei

2 [Figure: single-cycle datapath and control. Labels: pc, Shift Left 2, Extension, Opcode, Funct, Cont Unit, ALU Cont; control signals ExtOp, RegDst, ALUSrc, ALUOp, ALUCtr, MemtoReg, Branch, npcsle; ALU output OVF.]

3 [Figure: datapath divided into five stages: IF, ID, EXE, MEM, WB.]

4 [Figure: the same five-stage datapath.]

5 IF: the fetched instruction (Inst.) and PC+ are written into the IF/ID registers.

6 ID: reads IF/ID (PC+, Inst.); writes PC+, RegA, RegB, IMM (from Extension), Rt and Rd into the ID/EXE registers.

7 EXE: reads ID/EXE (PC+, RegA, RegB, IMM, Rt, Rd); writes the branch target address (Br. Tr. Add.), ALUres, RegB and Rt/Rd into the EXE/MEM registers.

8 MEM: reads EXE/MEM (Br. Tr. Add., ALUres, RegB, Rt/Rd); writes Mem, ALUres and Rt/Rd into the MEM/WB registers.

9 WB: reads MEM/WB (Mem, ALUres, Rt/Rd).

10 [Figure: the five-stage datapath (IF, ID, EXE, MEM, WB), repeated.]

11 Example
Run the following code on our pipeline machine:
add $,$0,$3
lw $,20($2)
sub $5,$6,$6
sw $7,8($8)
add $9,$,$3

12-16 [Figures: the example code advancing through the pipelined datapath, one cycle per slide. Each slide brings in the next instruction (add, lw, sub, sw, add) behind the ones already in flight; the register file R0..R9 and the Extension unit are shown.]

17 Recall: Single cycle control!
[Figure: single-cycle datapath with Next PC logic, two Ideal Memory blocks, a 32-bit register file (ports Rw, Ra, Rb; 5-bit fields Rd, Rs, Rt), the ALU, and a Control block that produces the Control Signals from the Conditions.]

18 Stationary Control
The Main Control generates the control signals during Reg/Dec.
Control signals for Exec (ExtOp, ALUSrc, ...) are used 1 cycle later.
Control signals for Mem (MemWr, Branch) are used 2 cycles later.
Control signals for Wr (MemtoReg, RegWr) are used 3 cycles later.
[Figure: signals carried by each pipeline register. ID/Ex: ExtOp, ALUSrc, ALUOp, RegDst, MemWr, Branch, MemtoReg, RegWr. Ex/Mem: MemWr, Branch, MemtoReg, RegWr. Mem/Wr: MemtoReg, RegWr.]
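The idea of control bits travelling with the instruction can be sketched in a few lines of Python. This is only an illustration: the grouping of signals per stage follows the slide, while the function names and the concrete signal values for a load are assumptions made for the example.

```python
# Which stage consumes which control signals (grouping taken from the slide).
STAGE_SIGNALS = {
    "Exec": ["ExtOp", "ALUSrc", "ALUOp", "RegDst"],
    "Mem": ["MemWr", "Branch"],
    "Wr": ["MemtoReg", "RegWr"],
}

# Hypothetical signal values for a load instruction (assumed, not from the slides).
LW_SIGNALS = {
    "ExtOp": 1, "ALUSrc": 1, "ALUOp": "add", "RegDst": 0,
    "MemWr": 0, "Branch": 0, "MemtoReg": 1, "RegWr": 1,
}

def decode(signals):
    """Main Control: produce the full bundle that is latched into ID/Ex."""
    return dict(signals)

def advance(bundle, stage):
    """Consume this stage's signals and pass the rest to the next pipeline register."""
    used = {name: bundle[name] for name in STAGE_SIGNALS[stage]}
    remaining = {name: value for name, value in bundle.items() if name not in used}
    return used, remaining

bundle = decode(LW_SIGNALS)
for stage in ("Exec", "Mem", "Wr"):
    used, bundle = advance(bundle, stage)
    print(stage, "uses", sorted(used), "and passes on", sorted(bundle))
```

Each pipeline register only needs to be wide enough for the signals that later stages still use, which is why the bundle shrinks from stage to stage.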

19 [Figure: datapath + stationary control. Stages: Next PC, Inst. Mem, Decode, Reg. File, Exec, Mem Access, Reg File. The IR fields (op, rs, rt, fun) are decoded once, and the control fields (im, ex, me, wb, v, rw) travel down the pipeline, feeding Mem Ctrl and WB Ctrl.]

20 [Figure: the single-cycle datapath and control of slide 2, repeated.]

21 [Figure: the five-stage datapath (IF, ID, EXE, MEM, WB), repeated.]

22 [Figure: the five-stage datapath, repeated.]

23 Pipeline timing diagram
add $,$0,$3    IF  ID  EXE MEM WB
lw $,20($2)        IF  ID  EXE MEM WB
sub $5,$6,$6           IF  ID  EXE MEM WB
sw $7,8($8)                IF  ID  EXE MEM WB
add $9,$,$3                    IF  ID  EXE MEM WB
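A diagram like this one can be generated mechanically. The sketch below assumes an ideal five-stage pipeline with no stalls, where instruction i enters IF in cycle i+1; the function names are made up for the example, and the instructions are given as bare mnemonics because the register numbers in the transcription are incomplete.

```python
STAGES = ["IF", "ID", "EXE", "MEM", "WB"]

def timing_diagram(instructions):
    """Return (text, cells) rows; cells[c] is the stage occupied in cycle c+1."""
    rows = []
    for i, text in enumerate(instructions):
        cells = [""] * i + STAGES  # instruction i starts one cycle after i-1
        rows.append((text, cells))
    return rows

def render(rows):
    """Format the rows as an aligned plain-text table."""
    width = max(len(text) for text, _ in rows)
    lines = []
    for text, cells in rows:
        body = " ".join(cell.ljust(3) for cell in cells).rstrip()
        lines.append(text.ljust(width) + "  " + body)
    return "\n".join(lines)

def total_cycles(n_instructions, depth=len(STAGES)):
    """Cycles needed to drain n instructions through an ideal pipeline."""
    return n_instructions + depth - 1

rows = timing_diagram(["add", "lw", "sub", "sw", "add"])
print(render(rows))
print("total cycles:", total_cycles(len(rows)))
```

For n instructions and a pipeline of depth d, the last instruction finishes in cycle n + d - 1, so the five instructions of the example need nine cycles.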

24 Hazards
What are they?
How do you detect them?
How do you deal with them?

25 [Figure: pipelined datapath with named pipeline-register fields. IF/ID: PC+, instruction. ID/EXE: PC+, valA, valB, IMM, dest. EXE/MEM: target, ALUres, eq?, valB, dest. MEM/WB: mdata, ALUres, dest.]

26 Pipeline cycles for add
IF - Fetch: read the instruction from memory
ID - Decode: read the source operands from the register file
EXE - Execute: calculate the sum
MEM - Memory: pass the result to the next stage
WB - Write back: write the sum (ALUres) into the register file

27 Hazard
add $,$2,$3    IF  ID  EXE MEM WB        (register one is written)
sub $,$5,$         IF  ID  EXE MEM WB    (register one is read)
If we are not careful, we will read the wrong value!
If sub is supposed to read the updated value (not the stale one), how many instructions should be in between add and sub?
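The question can be answered by counting cycles. The sketch below assumes the five-stage pipeline of these slides, with the register file written in WB and read in ID, and, as an additional assumption, that a write in the first half of a cycle is visible to a read in the second half of the same cycle. The function name is hypothetical.

```python
# Cycle (relative to fetch = cycle 1) in which an instruction occupies each stage.
STAGE_CYCLE = {"IF": 1, "ID": 2, "EXE": 3, "MEM": 4, "WB": 5}

def independent_instructions_needed(write_stage="WB", read_stage="ID",
                                    same_cycle_ok=True):
    """How many instructions must sit between a producer and its consumer."""
    w = STAGE_CYCLE[write_stage]   # producer fetched in cycle 1 writes in cycle w
    r = STAGE_CYCLE[read_stage]
    # With g instructions in between, the consumer is fetched in cycle g + 2
    # and reads in cycle g + 2 + (r - 1) = g + r + 1.
    # The read must happen in cycle >= w (or > w if same-cycle reads are stale).
    gap = w - r - 1 if same_cycle_ok else w - r
    return max(0, gap)

print(independent_instructions_needed())                      # split-cycle register file
print(independent_instructions_needed(same_cycle_ok=False))   # no same-cycle bypass
```

Under the split-cycle assumption two instructions are enough, which agrees with the two noops and the two hazard cycles shown on the later slides.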

28 [Figure: pipelined datapath with add $,$2,$3 one stage ahead of sub $,$5,$; the register file R0..R9 is shown.]

29 Hazard
add $,$2,$3    IF ID EXE MEM WB                    (write)
sub $,$5,$        IF hazard hazard ID EXE MEM WB   (read)

30 Class work
What are the data hazards in this piece of code?
add $,$2,$3
sub $2,$,$3
xor $,$3,$5
nor $5,$2,$
add $5,$3,$5
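One way to check an answer to this kind of exercise is to search for read-after-write dependences that are too close together. The sketch below is not a solution to the class work: the register numbers in the transcription are incomplete, so it uses a hypothetical program, and it assumes that a consumer within two instructions of its producer is a hazard (register file written in WB, read in ID).

```python
def find_raw_hazards(program, window=2):
    """program: list of (op, dest, src1, src2) with integer register numbers.

    Returns (producer_index, consumer_index, register) for every read that
    follows the write too closely for the register file to be up to date.
    """
    hazards = []
    for i, (_, dest, _, _) in enumerate(program):
        for j in range(i + 1, min(i + 1 + window, len(program))):
            _, later_dest, src1, src2 = program[j]
            if dest in (src1, src2):
                hazards.append((i, j, dest))
            if later_dest == dest:
                break  # a newer write hides this one from later readers
    return hazards

# Hypothetical program (register numbers chosen for illustration only).
PROGRAM = [
    ("add", 1, 2, 3),
    ("sub", 4, 1, 3),
    ("xor", 5, 1, 6),
    ("nor", 7, 1, 4),
    ("add", 8, 5, 7),
]

for producer, consumer, reg in find_raw_hazards(PROGRAM):
    print(f"instruction {consumer} reads ${reg} written by instruction {producer}")
```

The third read of register 1 (by the nor) is not reported because three instructions separate it from the add, so the value is already in the register file.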

31 What to do with them?
Avoid: make sure there are no hazards in the code.
Detect and Stall: if hazards exist, stall the processor until they go away.
Detect and Forward: if hazards exist, fix up the pipeline to get the correct value (if possible).

32 First Approach: avoid all hazards
Assume the programmer (or the compiler) knows about the processor implementation and makes sure no hazards exist.
Suppose there is an instruction called noop. Put noops between any dependent instructions.
add $,$2,$3    IF  ID  EXE MEM WB
noop
noop
sub $,$5,$                 IF  ID  EXE MEM WB

33 What is the problem with this solution?
Old programs (legacy code) may not run correctly on new implementations: longer pipelines need more noops.
Programs get larger as noops are included. This is especially a problem for machines that try to execute more than one instruction every cycle. Intel EPIC: often 25% - 0% of instructions are noops.
Program execution is slower: CPI is 1, but some instructions are noops.
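The last point can be made concrete. If the pipeline completes one instruction per cycle but a fraction of those instructions are noops, the cost per useful instruction is higher. The sketch below is a back-of-the-envelope calculation; the function name is hypothetical, and the 25% figure is simply the lower bound quoted on the slide.

```python
def cpi_per_useful_instruction(noop_fraction, pipeline_cpi=1.0):
    """Cycles spent per non-noop instruction.

    noop_fraction: share of all executed instructions that are noops (0 <= f < 1).
    """
    if not 0 <= noop_fraction < 1:
        raise ValueError("noop_fraction must be in [0, 1)")
    return pipeline_cpi / (1.0 - noop_fraction)

print(cpi_per_useful_instruction(0.25))
```

With a quarter of the instruction stream made of noops, every useful instruction effectively costs about 1.33 cycles even though the nominal CPI is 1.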

34 The second solution
Detect:
Compare regA with previous DestRegs (5-bit operand fields).
Compare regB with previous DestRegs (5-bit operand fields).
Stall:
Keep current instructions in fetch and decode.
Pass a noop to execute.
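The detect-and-stall rule can be sketched as a comparison plus a small scheduling loop. The model below is an assumption-laden simplification: instructions are (dest, regA, regB) triples with hypothetical register numbers, only instructions in EXE and MEM are checked (a value written in WB is assumed readable in the same cycle), and special handling of register 0 is ignored.

```python
def must_stall(reg_a, reg_b, pending_dests):
    """True if either source register matches a destination still in EXE or MEM."""
    return any(d is not None and d in (reg_a, reg_b) for d in pending_dests)

def decode_cycles(program):
    """program: list of (dest, reg_a, reg_b). Returns the cycle in which each
    instruction finally leaves decode (first instruction decodes in cycle 2)."""
    cycles = []
    for i, (_, reg_a, reg_b) in enumerate(program):
        cycle = 2 if i == 0 else cycles[-1] + 1
        while True:
            # An older instruction decoded in cycle c is in EXE at c+1, MEM at c+2.
            pending = [dest for (dest, _, _), c in zip(program, cycles)
                       if cycle - c in (1, 2)]
            if not must_stall(reg_a, reg_b, pending):
                break
            cycle += 1  # hold the instruction in decode, a noop goes to execute
        cycles.append(cycle)
    return cycles

# Hypothetical program: the second instruction reads the first one's result.
print(decode_cycles([(1, 2, 3), (4, 1, 5), (6, 7, 8)]))
```

The dependent instruction is held in decode for two extra cycles, matching the two hazard cycles drawn in the timing diagrams.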

35 [Figure: the pipelined datapath of slide 25, repeated.]

36 [Figure: the pipelined datapath of slide 25, repeated.]

37 [Figure: the same pipelined datapath, drawn without the dest fields in the pipeline registers.]

38 Hazard
Addr 0x00
add $,$2,$3    IF ID EXE MEM WB                    (write)
sub $,$5,$        IF hazard hazard ID EXE MEM WB   (read)

39 [Figure: first half of cycle 1. add $,$2,$3 is being fetched.]

40 [Figure: second half of cycle 1. add $,$2,$3 is latched into the IF/ID register.]

41 [Figure: first half of cycle 2. add is in decode; sub $,$,$5 is being fetched.]

42 [Figure: second half of cycle 2. add moves into ID/EXE; sub is latched into IF/ID.]

43 [Figure: first half of cycle 3. Hazard detected: sub, in decode, needs a register that add, in execute, has not yet written.]

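44 [Figure: hazard detection. Comparators check regA and regB of the instruction in IF/ID against the destination registers held in the later pipeline registers (ID/EX onward); the REG file is shown between IF/ID and ID/EX.]

45 [Figure: hazard detection, close-up of the comparators on regA and regB.]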

46 What Next?
Detect:
Compare regA with previous DestRegs (5-bit operand fields).
Compare regB with previous DestRegs (5-bit operand fields).
Stall:
Keep current instructions in fetch and decode.
Pass a noop to execute.

47 [Figure: second half of cycle 3. Hazard detected: sub $,$,$5 stays in decode and a noop is passed to execute.]

48 [Figure: first half of cycle 4. The hazard is still detected: add is in memory, the noop is in execute, sub is still in decode.]

49 [Figure: second half of cycle 4. A second noop is passed to execute.]

50 [Figure: first half of cycle 5. add reaches write back; sub, still in decode, can now read the updated register. The two noops are ahead of it.]

51 [Figure: second half of cycle 5. sub proceeds to execute behind the two noops.]

52 Timing graph
add $,$2,$3: IF ID EX ME WB
sub $,$,$5: IF noop noop ID EX ME WB
add $6,$,$7: IF ID EX ME WB
lw $6,0($8): IF ID EX ME WB
sw $6,3($): IF noop noop ID EX ME

53 Problems with the second solution
CPI is still the same as before; there is no improvement in performance.
The only improvements are in code size and in the fact that the compiler is no longer responsible for detecting the data hazards.
In fact, the system now runs slower. Why?

54 The third solution
Detect the data hazard.
The add instruction calculates its result in the execute cycle.
Forward the result to the decode stage of the sub instruction.
Therefore sub does not need to wait until the result is written back into the register file.
More control is needed: the result has to be taken from somewhere other than the register file.

55 The third solution
Detect: same as detect and stall, except that all hazards are treated differently (i.e., you can't logical-or the hazard signals).
Forward: new bypass datapaths route computed data to where it is needed; new MUX and control pick the right data.
Beware: stalling may still be required even in the presence of forwarding.
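The MUX that picks the right data can be sketched as a priority search over the instructions still in flight. This is an illustrative model rather than the datapath on the slides: the in-flight list, the tuple layout and the function name are assumptions, and a value of None stands for a result that has not been produced yet (for example a load that has not reached memory), which is the case where a stall is still needed.

```python
def read_operand(reg, regfile, in_flight):
    """Select the source of an operand.

    in_flight: list of (dest, value) for older instructions that have not yet
    written back, youngest first; value is None if not yet computed.
    Returns ("forward", v), ("stall", None) or ("regfile", v).
    """
    for dest, value in in_flight:
        if dest == reg:
            if value is None:
                return ("stall", None)    # result not available anywhere yet
            return ("forward", value)     # bypass: newest matching producer wins
    return ("regfile", regfile[reg])      # no hazard: the register file is current

regfile = {1: 10, 2: 20, 3: 30}
print(read_operand(1, regfile, [(1, 99), (1, 50)]))  # two producers of register 1
print(read_operand(3, regfile, [(3, None)]))          # producer not finished
print(read_operand(2, regfile, [(1, 99)]))            # no producer in flight
```

Because each hazard selects a different bypass path, the individual comparison results have to stay separate instead of being OR-ed into a single stall signal.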

56 [Figure: first half of cycle 3. Hazard detected between sub $,$,$5 (decode) and add $,$2,$3 (execute); add $6,$,$7 is being fetched. The forwarding (FW) paths are shown.]

57 [Figure: end of cycle 3. The hazard (H) is recorded alongside the forwarding (FW) paths; add $6,$,$7 is latched into IF/ID.]

58 [Figure: first half of cycle 4. New hazard detected for add $6,$,$7; lw $6,0($8) is being fetched.]

59 [Figure: end of cycle 4. Two hazards (H, H2) are now recorded; lw $6,0($8) is latched into IF/ID.]

60 [Figure: first half of cycle 5. New hazard detected for lw $6,0($8); sw $6,3($) is being fetched.]

61 What else can go wrong in our pipelined CPU?
Control hazards.
Exceptions: first of all, what are exceptions? And how do you handle exceptions in a pipelined processor with 5 instructions in flight?

62 Control Hazard
What is a control hazard?
How does the pipelined CPU handle control hazards?

63 [Figure: pipelined datapath with a Control Unit and an ALU Unit. For beq and bne, the target and eq? fields of the EXE/MEM register are used to update the pc.]

64 What happens in executing BEQ?
Fetch: read the instruction from memory
Decode: read the source operands from the register file
Execute: calculate the target address and test for equality
Memory: send the target to the PC if the test is equal
Write back: nothing left to do
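The execute and memory steps of a branch can be sketched as follows. The sketch assumes the usual MIPS convention, which the Shift Left 2 and Extension blocks in the datapath suggest: 4-byte instructions, a 16-bit immediate that is sign-extended, shifted left by two and added to the address of the next instruction. The function names are hypothetical.

```python
def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def branch_target(pc, imm16):
    """Execute stage: target = (pc + 4) + (sign-extended immediate << 2)."""
    return (pc + 4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF

def next_pc_beq(pc, imm16, val_a, val_b):
    """Memory stage: send the target to the PC only if the operands are equal."""
    return branch_target(pc, imm16) if val_a == val_b else pc + 4

# A branch at address 24 with a word offset of -4 (encoded as 0xFFFC).
print(branch_target(24, 0xFFFC))
print(next_pc_beq(24, 0xFFFC, 5, 5), next_pc_beq(24, 0xFFFC, 5, 0))
```

With these assumptions a branch at address 24 and a word offset of -4 jumps back to address 12, while an untaken branch simply continues at 28.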

65 Example
y=y*2;
x=0;
for(j=00;j>0;j--){
  x++;
  z--;
}
y--;
x=x*3;
z=z+x;

00 add $3,$3,$3
04 add $2,$0,$0
08 li $5,00
12 addi $2,$2,
16 addi $,$,-
20 addi $5,$5,-
24 bne $5,$0,-
28 addi $3,$3,-
32 add $5,$2,$0
36 add $2,$2,$2
40 add $2,$2,$5
44 add $,$,$2

66 What do you observe from the example?
How many times is the branch taken? How many times is it not taken?
What happens each time the branch instruction is executed?
What happens next?

67 Surprise!
12 addi $2,$2,
...
24 bne $5,$0,-     IF  ID  EXE MEM WB
28 addi $3,$3,-        IF  ID  EXE MEM WB
32 add $5,$2,$0            IF  ID  EXE MEM WB
36 add $2,$2,$2                IF  ID  EXE MEM WB
12 addi $2,$2,                     IF  ID  EXE MEM WB

68 Solutions
Avoid: make sure there are no hazards in the code.
Detect and Stall: delay fetch until the branch is resolved.
Speculate and Squash-if-Wrong: go ahead and fetch more instructions in case the guess is correct, but stop them if they shouldn't have been executed.

69 Avoid
Don't have branch instructions! Maybe a little impractical.
Delay taking the branch:
dbeq R,R2,offset
dbne R,R2,offset
Instructions at PC+4, PC+8, etc. will execute before deciding whether to fetch from PC+4+offset. (If no useful instructions can be placed after dbeq, noops must be inserted.)

70 Consider our example again
Original code:
00 add $3,$3,$3
04 add $2,$0,$0
08 li $5,00
12 addi $2,$2,
16 addi $,$,-
20 addi $5,$5,-
24 bne $5,$0,-
28 addi $3,$3,-
32 add $5,$2,$0
36 add $2,$2,$2
40 add $2,$2,$5
44 add $,$,$2

With noops after the branch:
00 add $3,$3,$3
04 add $2,$0,$0
08 li $5,00
12 addi $2,$2,
16 addi $,$,-
20 addi $5,$5,-
24 bne $5,$0,-
28 noop
32 noop
36 noop
40 addi $3,$3,-
44 add $5,$2,$0
48 add $2,$2,$2
52 add $2,$2,$5
56 add $,$,$2

71 Can we do better?
First version:
00 add $3,$3,$3
04 add $2,$0,$0
08 li $5,00
12 addi $5,$5,-
16 dbne $5,$0,-2
20 addi $,$,-
24 addi $2,$2,
28 noop
32 addi $3,$3,-
36 add $5,$2,$0
40 add $2,$2,$2
44 add $2,$2,$5
48 add $,$,$2

Second version:
00 add $3,$3,$3
04 add $2,$0,$0
08 li $5,00
12 dbne $5,$0,-
16 addi $5,$5,-
20 addi $,$,-
24 addi $2,$2,
28 addi $3,$3,-
32 add $5,$2,$0
36 add $2,$2,$2
40 add $2,$2,$5
44 add $,$,$2
This code generates wrong results.

72 Problems with this solution
Old programs (legacy code) may not run correctly on new implementations: longer pipelines need more instructions/noops after a delayed beq.
Programs get larger as noops are included. This is especially a problem for machines that try to execute more than one instruction every cycle. Intel EPIC: often 25% - 0% of instructions are noops.
Program execution is slower: CPI equals 1, but some instructions are noops.

73 Detect and Stall (hardware approach)
Detection: must wait until decode. Compare the opcode to beq. Alternately, this is just another control signal.
Stall: keep the current instruction in fetch. Pass a noop to the decode stage (not execute!).

74 Our example again [the assembly listing of slide 65, repeated]

75 [Figure: detect and stall. bne $5,$0,- enters the pipeline and the Control Unit identifies it as a branch.]

76 [Figure: bne advances; the pc is held at 28 and a noop is passed down the pipeline behind it.]

77 [Figure: bne advances again; the pc is still held at 28 and a second noop follows.]

78 [Figure: bne reaches the stage where target and eq are available; a third noop follows.]

79 [Figure: the branch is resolved and the pc is redirected: addi $2,$2, is fetched behind the three noops.]

80 What seems to be the problem?
CPI increases every time a branch is detected! Is that necessary? Not always! The branch is taken only about ½ of the time.
Let's assume that it is NOT taken. In this case, we can ignore the beq or bne (treat it like a noop) and keep fetching from PC+4.
What if we are wrong? That is OK, as long as we do not COMPLETE any instructions we mistakenly executed (i.e., don't perform writeback).

81 Speculate and Squash
Speculate: assume not equal. Keep fetching from PC+4 until we know that the branch is really taken.
Squash: stop bad instructions if the branch is taken. Send a noop to Decode, Execute and Memory. Send the target address to the PC.
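The benefit of speculating can be estimated with a simple CPI model. The sketch below assumes a three-cycle branch penalty (the three noops shown on the slides) and the slides' figure that about half of the branches are taken; the share of branches in the instruction mix (20% here) is a made-up value used only for illustration, and the function names are hypothetical.

```python
def cpi_stall(branch_fraction, penalty=3):
    """Detect and stall: every branch costs the full penalty."""
    return 1 + branch_fraction * penalty

def cpi_speculate(branch_fraction, taken_fraction=0.5, penalty=3):
    """Speculate not-taken: only taken branches are squashed and pay the penalty."""
    return 1 + branch_fraction * taken_fraction * penalty

branch_fraction = 0.2  # assumed share of branches among all instructions
print("stall:    ", round(cpi_stall(branch_fraction), 2))
print("speculate:", round(cpi_speculate(branch_fraction), 2))
```

Under these assumptions speculation halves the branch overhead, because the penalty is paid only when the not-taken guess turns out to be wrong.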

82 Our example again [the assembly listing of slide 65, repeated]

83 [Figure: squash. When the bne is found to be taken, the instructions fetched after it (28 addi $3,$3,-; 32 add $5,$2,$0; 36 add $2,$2,$2) are replaced by noops in decode, execute and memory.]

84 Performance problem, again
CPI increases every time a branch is taken! That is about ½ of the time.
Is that necessary? No! But how can you fetch from the target before you even know the previous instruction is a branch, much less whether it is taken?

85-87 [Figures: the pipelined datapath extended with a table of (bpc, target) entries next to the PC. An entry for bne $5,$0,- is shown, and slide 87 adds an eq? comparison against the PC.]

88 Branch Prediction
Predict not taken: ~50% accurate
Predict backward taken: ~65% accurate
Predict same as last time: ~80% accurate
Pentium: ~85% accurate
Pentium Pro: ~92% accurate
Best paper designs: ~96% accurate
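The first three schemes are simple enough to try on a branch trace. The sketch below is an illustration, not a reproduction of the accuracy figures above: it models a single backward loop branch that is taken on every iteration except the last, and all function names are hypothetical.

```python
def loop_trace(iterations):
    """Outcomes of a loop-closing branch: taken until the final iteration."""
    return [True] * (iterations - 1) + [False]

def accuracy(predictions, outcomes):
    """Fraction of predictions that match the actual outcomes."""
    hits = sum(p == o for p, o in zip(predictions, outcomes))
    return hits / len(outcomes)

def predict_not_taken(outcomes):
    return [False] * len(outcomes)

def predict_backward_taken(outcomes, is_backward=True):
    # Static rule: backward branches (loops) are predicted taken.
    return [is_backward] * len(outcomes)

def predict_same_as_last(outcomes, initial=False):
    # One bit of history: predict whatever the branch did last time.
    predictions, last = [], initial
    for outcome in outcomes:
        predictions.append(last)
        last = outcome
    return predictions

trace = loop_trace(10)
for name, predictor in [("not taken", predict_not_taken),
                        ("backward taken", predict_backward_taken),
                        ("same as last time", predict_same_as_last)]:
    print(name, accuracy(predictor(trace), trace))
```

On a single loop the last-time predictor misses twice (on entry and on exit) while backward-taken misses only once, so the ranking on one trace need not match the averages quoted for whole programs.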

EECS150 - Digital Design Lecture 9 - CPU Microarchitecture. CMOS Devices

EECS150 - Digital Design Lecture 9 - CPU Microarchitecture. CMOS Devices EECS150 - Digital Design Lecture 9 - CPU Microarchitecture Feb 17, 2009 John Wawrzynek Spring 2009 EECS150 - Lec9-cpu Page 1 CMOS Devices Review: Transistor switch-level models The gate acts like a capacitor.

More information

06 1 MIPS Implementation Pipelined DLX and MIPS Implementations: Hardware, notation, hazards.

06 1 MIPS Implementation Pipelined DLX and MIPS Implementations: Hardware, notation, hazards. 06 1 MIPS Implementation 06 1 Material from Chapter 3 of H&P (for DLX). Material from Chapter 6 of P&H (for MIPS). line: (In this set.) Unpipelined DLX Implementation. (Diagram only.) Pipelined DLX and

More information

Instruction Level Parallelism

Instruction Level Parallelism Instruction Level Parallelism Pipelining, Hazards Appendix C, HPe Outline Pipelining, Hazards Branch prediction Static and Dynamic Scheduling Speculation Compiler techniques, VLIW Limits of ILP. Pipelining

More information

Slide Set 6. for ENCM 369 Winter 2018 Section 01. Steve Norman, PhD, PEng

Slide Set 6. for ENCM 369 Winter 2018 Section 01. Steve Norman, PhD, PEng Slide Set 6 for ENCM 369 Winter 2018 Section 01 Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary February 2018 ENCM 369 Winter 2018 Section

More information

Pipelining. Improve performance by increasing instruction throughput Program execution order. Data access. Instruction. fetch. Data access.

Pipelining. Improve performance by increasing instruction throughput Program execution order. Data access. Instruction. fetch. Data access. Chapter 6 Pipelining Improve performance by increasing instrction throghpt Program eection order Time (in instrctions) lw $, ($) Instrction fetch 2 4 6 8 2 4 6 8 ALU Data access lw $2, 2($) 8 ns Instrction

More information

Contents Slide Set 6. Introduction to Chapter 7 of the textbook. Outline of Slide Set 6. An outline of the first part of Chapter 7

Contents Slide Set 6. Introduction to Chapter 7 of the textbook. Outline of Slide Set 6. An outline of the first part of Chapter 7 CM 69 W4 Section Slide Set 6 slide 2/9 Contents Slide Set 6 for CM 69 Winter 24 Lecture Section Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary

More information

Digital Design and Computer Architecture

Digital Design and Computer Architecture Digital Design and Computer Architecture Lab 0: Multicycle Processor (Part ) Introduction In this lab and the next, you will design and build your own multicycle MIPS processor. You will be much more on

More information

PIPELINING: BRANCH AND MULTICYCLE INSTRUCTIONS

PIPELINING: BRANCH AND MULTICYCLE INSTRUCTIONS PIPELINING: BRANCH AND MULTICYCLE INSTRUCTIONS Mahdi Nazm Bojnordi Assistant Professor School of Computing University of Utah CS/ECE 6810: Computer Architecture Overview Announcement Homework 1 submission

More information

ASIC = Application specific integrated circuit

ASIC = Application specific integrated circuit ASIC = Application specific integrated circuit CS 2630 Computer Organization Meeting 19: Building a MIPS processor Brandon Myers University of Iowa The goal: implement most of MIPS So far Implementing

More information

CpE 442. Designing a Pipeline Processor (lect. II)

CpE 442. Designing a Pipeline Processor (lect. II) CpE 442 Designing a Pipeline Pocesso (lect. II) CPE 442 hazads.1 Otline of Today s Lecte Recap and Intodction (5 mintes) Intodction to Hazads (15 mintes) Fowading (25 mintes) 1 cycle Load Delay (5 mintes)

More information

CS 152 Midterm 2 May 2, 2002 Bob Brodersen

CS 152 Midterm 2 May 2, 2002 Bob Brodersen CS 152 Midterm 2 May 2, 2002 Bob Brodersen Name Solutions Show your work if you want partial credit! Try all the problems, don t get stuck on one of them. Each one is worth 10 points. 1) 2) 3) 4) 5) 6)

More information

Slide Set 8. for ENCM 501 in Winter Term, Steve Norman, PhD, PEng

Slide Set 8. for ENCM 501 in Winter Term, Steve Norman, PhD, PEng Slide Set 8 for ENCM 501 in Winter Term, 2017 Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary Winter Term, 2017 ENCM 501 W17 Lectures: Slide

More information

Chapter 3 Instruction-Level Parallelism and its Exploitation (Part 1)

Chapter 3 Instruction-Level Parallelism and its Exploitation (Part 1) Chapter 3 Instruction-Level Parallelism and its Exploitation (Part 1) ILP vs. Parallel Computers Dynamic Scheduling (Section 3.4, 3.5) Dynamic Branch Prediction (Section 3.3) Hardware Speculation and Precise

More information

Outline. 1 Reiteration. 2 Dynamic scheduling - Tomasulo. 3 Superscalar, VLIW. 4 Speculation. 5 ILP limitations. 6 What we have done so far.

Outline. 1 Reiteration. 2 Dynamic scheduling - Tomasulo. 3 Superscalar, VLIW. 4 Speculation. 5 ILP limitations. 6 What we have done so far. Outline 1 Reiteration Lecture 5: EIT090 Computer Architecture 2 Dynamic scheduling - Tomasulo Anders Ardö 3 Superscalar, VLIW EIT Electrical and Information Technology, Lund University Sept. 30, 2009 4

More information

Fundamentals of Computer Systems

Fundamentals of Computer Systems Fundamentals of Computer Systems A Pipelined MIPS Processor Stephen A. Edwards Columbia University Summer 25 Technical Illustrations Copyright c 27 Elsevier Sequential Laundry Time Alice Bob Cindy Pipelined

More information

Fill-in the following to understand stalling needs and forwarding opportunities

Fill-in the following to understand stalling needs and forwarding opportunities Fill-in the following to understand stalling needs and forwarding opportunities Instruction ADD4 ADD Receiving forwarding help Providing forwarding help Insists on Doesn t mind Doesn t mind Capable of

More information

Digital Design and Computer Architecture

Digital Design and Computer Architecture Digital Design and Computer Architecture Lab 0: Multicycle ARM Processor (Part ) Introduction In this lab and the next, you will design and build your own multicycle ARM processor. You will be much more

More information

Chapter 4 (Part I) The Processor. Baback Izadi Division of Engineering Programs

Chapter 4 (Part I) The Processor. Baback Izadi Division of Engineering Programs EGC442 Introdction to Compter Architectre Chapter 4 (Part I) The Processor Baback Izadi Division of Engineering Programs bai@engr.newpaltz.ed Introdction CPU performance factors Instrction cont Determined

More information

4.5 Pipelining. Pipelining is Natural!

4.5 Pipelining. Pipelining is Natural! 4.5 Pipelining Ovelapped execution of instuctions Instuction level paallelism (concuency) Example pipeline: assembly line ( T Fod) Response time fo any instuction is the same Instuction thoughput inceases

More information

Instruction Level Parallelism and Its. (Part II) ECE 154B

Instruction Level Parallelism and Its. (Part II) ECE 154B Instruction Level Parallelism and Its Exploitation (Part II) ECE 154B Dmitri Strukov ILP techniques not covered last week this week next week Scoreboard Technique Review Allow for out of order execution

More information

Slide Set 9. for ENCM 501 in Winter Steve Norman, PhD, PEng

Slide Set 9. for ENCM 501 in Winter Steve Norman, PhD, PEng Slide Set 9 for ENCM 501 in Winter 2018 Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary March 2018 ENCM 501 Winter 2018 Slide Set 9 slide

More information

Register Transfer Level (RTL) Design Cont.

Register Transfer Level (RTL) Design Cont. CSE4: Components and Design Techniques for Digital Systems Register Transfer Level (RTL) Design Cont. Tajana Simunic Rosing Where we are now What we are covering today: RTL design examples, RTL critical

More information

Lecture 16: Instruction Level Parallelism -- Dynamic Scheduling (OOO) via Tomasulo s Approach

Lecture 16: Instruction Level Parallelism -- Dynamic Scheduling (OOO) via Tomasulo s Approach Lecture 16: Instruction Level Parallelism -- Dynamic Scheduling (OOO) via Tomasulo s Approach CSE 564 Computer Architecture Summer 2017 Department of Computer Science and Engineering Yonghong Yan yan@oakland.edu

More information

A VLIW Processor for Multimedia Applications

A VLIW Processor for Multimedia Applications A VLIW Processor for Multimedia Applications E. Holmann T. Yoshida A. Yamada Y. Shimazu Mitsubishi Electric Corporation, System LSI Laboratory 4-1 Mizuhara, Itami, Hyogo 664, Japan Outline Objective System

More information

Tomasulo Algorithm. Developed at IBM and first implemented in IBM s 360/91

Tomasulo Algorithm. Developed at IBM and first implemented in IBM s 360/91 Tomasulo Algorithm Developed at IBM and first implemented in IBM s 360/91 IBM wanted to use the existing compiler instead of a specialized compiler for high end machines. Tracks when operands are available

More information

CPE300: Digital System Architecture and Design

CPE300: Digital System Architecture and Design CPE300: Digital System Architecture and Design Fall 2011 MW 17:30-18:45 CBC C316 1-Bus Architecture and Datapath 10262011 http://www.egr.unlv.edu/~b1morris/cpe300/ 2 Outline 1-Bus Microarchitecture and

More information

A few questions to test your familiarity of Lab7 at the end of finishing all assigned parts of Lab 7

A few questions to test your familiarity of Lab7 at the end of finishing all assigned parts of Lab 7 EE457 Lab7 Questions page A few questions to test your familiarity of Lab7 at the end of finishing all assigned parts of Lab 7 1. A. In which parts or subparts of Lab 7 does the STALL signal cause the

More information

CSE140L: Components and Design Techniques for Digital Systems Lab. CPU design and PLDs. Tajana Simunic Rosing. Source: Vahid, Katz

CSE140L: Components and Design Techniques for Digital Systems Lab. CPU design and PLDs. Tajana Simunic Rosing. Source: Vahid, Katz CSE140L: Components and Design Techniques for Digital Systems Lab CPU design and PLDs Tajana Simunic Rosing Source: Vahid, Katz 1 Lab #3 due Lab #4 CPU design Today: CPU design - lab overview PLDs Updates

More information

On the Rules of Low-Power Design

On the Rules of Low-Power Design On the Rules of Low-Power Design (and How to Break Them) Prof. Todd Austin Advanced Computer Architecture Lab University of Michigan austin@umich.edu Once upon a time 1 Rules of Low-Power Design P = acv

More information

EEC 581 Computer Architecture. Instruction Level Parallelism (3.4 & 3.5 Dynamic Scheduling)

EEC 581 Computer Architecture. Instruction Level Parallelism (3.4 & 3.5 Dynamic Scheduling) 1 EEC 581 Computer Architecture Instruction Level Parallelism (3.4 & 3.5 Dynamic Scheduling) Chansu Yu Electrical and Computer Engineering Cleveland State University Overview of Chap. 3 (again) Pipelined

More information

Out-of-Order Execution

Out-of-Order Execution 1 Out-of-Order Execution Several implementations out-of-order completion CDC 6600 with scoreboarding IBM 360/91 with Tomasulo s algorithm & reservation stations out-of-order completion leads to: imprecise

More information

Very Short Answer: (1) (1) Peak performance does or does not track observed performance.

Very Short Answer: (1) (1) Peak performance does or does not track observed performance. Very Short Answer: (1) (1) Peak performance does or does not track observed performance. (2) (1) Which is more effective, dynamic or static branch prediction? (3) (1) Do benchmarks remain valid indefinitely?

More information

AN ABSTRACT OF THE THESIS OF

AN ABSTRACT OF THE THESIS OF AN ABSTRACT OF THE THESIS OF Licheng Zhang for the degree of Master of Science in Electrical and Computer Engineering presented on June 7, 1989. Title: The Design of A Reduced Instruction Set Computer

More information

Performance Evolution of 16 Bit Processor in FPGA using State Encoding Techniques

Performance Evolution of 16 Bit Processor in FPGA using State Encoding Techniques Performance Evolution of 16 Bit Processor in FPGA using State Encoding Techniques Madhavi Anupoju 1, M. Sunil Prakash 2 1 M.Tech (VLSI) Student, Department of Electronics & Communication Engineering, MVGR

More information

Advanced Pipelining and Instruction-Level Paralelism (2)

Advanced Pipelining and Instruction-Level Paralelism (2) Advanced Pipelining and Instruction-Level Paralelism (2) Riferimenti bibliografici Computer architecture, a quantitative approach, Hennessy & Patterson: (Morgan Kaufmann eds.) Tomasulo s Algorithm For

More information

A Low-cost, Radiation-Hardened Method for Pipeline Protection in Microprocessors

A Low-cost, Radiation-Hardened Method for Pipeline Protection in Microprocessors 1 A Low-cost, Radiation-Hardened Method for Pipeline Protection in Microprocessors Yang Lin, Mark Zwolinski, Senior Member, IEEE, and Basel Halak Abstract The aggressive scaling of semiconductor technology

More information

CS 110 Computer Architecture. Finite State Machines, Functional Units. Instructor: Sören Schwertfeger.

CS 110 Computer Architecture. Finite State Machines, Functional Units. Instructor: Sören Schwertfeger. CS 110 Computer Architecture Finite State Machines, Functional Units Instructor: Sören Schwertfeger http://shtech.org/courses/ca/ School of Information Science and Technology SIST ShanghaiTech University

More information

Sequential Logic Design CS 64: Computer Organization and Design Logic Lecture #14

Sequential Logic Design CS 64: Computer Organization and Design Logic Lecture #14 Sequential Logic Design CS 64: Computer Organization and Design Logic Lecture #14 Ziad Matni Dept. of Computer Science, UCSB Administrative Only 2.5 weeks left!!!!!!!! OMG!!!!! Th. 5/24 Sequential Logic

More information

Instruction Level Parallelism Part III

Instruction Level Parallelism Part III Course on: Advanced Computer Architectures Instruction Level Parallelism Part III Prof. Cristina Silvano Politecnico di Milano email: cristina.silvano@polimi.it 1 Outline of Part III Dynamic Scheduling

More information

Modeling Digital Systems with Verilog

Modeling Digital Systems with Verilog Modeling Digital Systems with Verilog Prof. Chien-Nan Liu TEL: 03-4227151 ext:34534 Email: jimmy@ee.ncu.edu.tw 6-1 Composition of Digital Systems Most digital systems can be partitioned into two types

More information

Chapter 05: Basic Processing Units Control Unit Design Organization. Lesson 11: Multiple Bus Organisation
