Pipelining: Improve performance by increasing instruction throughput


1 Chapter 6

2 Pipelining. Improve performance by increasing instruction throughput. [Figure: program execution order (in instructions) against time for three lw instructions, each passing through instruction fetch, ALU, and data access among its stages. Without pipelining a new instruction starts every 8 ns; with pipelining a new instruction starts every 2 ns.] Ideal speedup is the number of stages in the pipeline. Do we achieve this? Electrical & Computer Engineering, THE COLLEGE OF NEW JERSEY
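The question "Do we achieve this?" can be checked with a little arithmetic on the figure's timings. The following is a minimal sketch, assuming five pipeline stages (the count given later in these slides), a 2 ns pipelined clock, and 8 ns per lw without pipelining; the function names are illustrative and not from the slides.

```python
STAGES = 5          # assumed: five pipeline stages
CYCLE_NS = 2        # pipelined clock period taken from the figure
UNPIPELINED_NS = 8  # time per lw without pipelining, taken from the figure


def unpipelined_time(n):
    """Total time when each instruction finishes before the next one starts."""
    return n * UNPIPELINED_NS


def pipelined_time(n):
    """The first instruction fills the pipeline; each later one adds one cycle."""
    return (STAGES + n - 1) * CYCLE_NS


def speedup(n):
    """Ratio of unpipelined to pipelined time for n instructions."""
    return unpipelined_time(n) / pipelined_time(n)


if __name__ == "__main__":
    for n in (3, 1000):
        print(n, unpipelined_time(n), pipelined_time(n), round(speedup(n), 2))
```

Under these assumptions three instructions take 24 ns unpipelined against 14 ns pipelined, and for long instruction streams the speedup approaches 8/2 = 4 rather than 5, because the stages are not perfectly balanced.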

3 Pipelining. What makes it easy: all instructions are the same length; just a few instruction formats; memory operands appear only in loads and stores. What makes it hard? Structural hazards: suppose we had only one memory. Control hazards: need to worry about branch instructions. Data hazards: an instruction depends on a previous instruction.

4 Pipelining. We'll build a simple pipeline and look at these issues. We'll talk about modern processors and what really makes it hard: exception handling, and trying to improve performance with out-of-order execution, etc.

5 Basic Idea. IF: Instruction fetch. ID: Instruction decode/register file read. EX: Execute/address calculation. MEM: Memory access. WB: Write back. [Figure: datapath divided into the five stages, showing the PC, instruction memory, register file, sign extend, shift left 2, ALU, and data memory.] What do we need to add to actually split the datapath into stages?

6 Pipelined Datapath. [Figure: the datapath with pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB inserted between the stages.] Can you find a problem even if there are no dependencies? What instructions can we execute to manifest the problem?

7 Corrected Datapath. [Figure: the corrected pipelined datapath with pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB.]

8 Graphically Representing Pipelines. [Figure: multiple-clock-cycle diagram, program execution order (in instructions) against time (in clock cycles, CC 1 to CC 6), for an lw followed by a sub.] Can help with answering questions like: how many cycles does it take to execute this code? What is the ALU doing during cycle 4? Use this representation to help understand datapaths.
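Both questions on this slide can be answered mechanically once the diagram is treated as a table of (instruction, cycle) to stage. The following is a small sketch for an ideal pipeline with no stalls, assuming the five stages named on the Basic Idea slide; the helper names are illustrative.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def stage_in_cycle(index, cycle):
    """Stage occupied in clock cycle `cycle` (1-based) by the instruction at
    position `index` (0-based), or None if it is not in the pipeline then."""
    pos = cycle - 1 - index
    return STAGES[pos] if 0 <= pos < len(STAGES) else None


def total_cycles(n):
    """Cycles needed to run n instructions through an ideal pipeline."""
    return n + len(STAGES) - 1


def diagram(program):
    """Render one text row per instruction, one column per clock cycle."""
    rows = []
    for i, text in enumerate(program):
        cells = [
            (stage_in_cycle(i, c) or "").ljust(4)
            for c in range(1, total_cycles(len(program)) + 1)
        ]
        rows.append(f"{text:<6}" + "".join(cells))
    return "\n".join(rows)


if __name__ == "__main__":
    print(diagram(["lw", "sub"]))
    print("cycles:", total_cycles(2))
    print("in EX during cycle 4:", ["lw", "sub"][1], stage_in_cycle(1, 4))
```

For the two-instruction example the sketch gives six cycles, matching the CC 1 to CC 6 span of the figure, and shows the second instruction (the sub) in EX during cycle 4, so the ALU is working on the sub at that point.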

9 Pipeline Control. [Figure: pipelined datapath (IF/ID, ID/EX, EX/MEM, MEM/WB) with its control lines labeled, including PCSrc, Branch, ALUSrc, ALUOp, RegDst, MemtoReg, the memory control lines, and the ALU control unit.]

10 Pipeline control. We have 5 stages. What needs to be controlled in each stage? Instruction Fetch and PC Increment; Instruction Decode / Register Fetch; Execution; Memory Stage; Write Back.

11 Pipeline control. How would control be handled in an automobile plant? A fancy control center telling everyone what to do? Should we use a finite state machine?

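12 Pipeline Control. Pass control signals along just like the data. [Table: control line settings for R-format, lw, sw, and beq, grouped by stage. Execution/address calculation stage control lines: RegDst, two ALUOp bits, ALUSrc. Memory access stage control lines: Branch, MemRead, MemWrite. Write-back stage control lines: RegWrite, MemtoReg. The sw and beq rows contain don't-care (X) entries.] [Figure: the control unit's outputs carried through the ID/EX, EX/MEM, and MEM/WB pipeline registers in EX, M, and WB groups.]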
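The idea of passing control along with the data can be sketched as a bundle that shrinks by one group at each stage. The signal values below are an assumption: they are the settings commonly given for this MIPS subset in textbook treatments, not values read from the slide, and "X" marks a don't-care. The function names are illustrative.

```python
# Assumed control settings for the four instruction classes ("X" = don't care).
CONTROL = {
    "R-format": {
        "EX": {"RegDst": 1, "ALUOp1": 1, "ALUOp0": 0, "ALUSrc": 0},
        "M": {"Branch": 0, "MemRead": 0, "MemWrite": 0},
        "WB": {"RegWrite": 1, "MemtoReg": 0},
    },
    "lw": {
        "EX": {"RegDst": 0, "ALUOp1": 0, "ALUOp0": 0, "ALUSrc": 1},
        "M": {"Branch": 0, "MemRead": 1, "MemWrite": 0},
        "WB": {"RegWrite": 1, "MemtoReg": 1},
    },
    "sw": {
        "EX": {"RegDst": "X", "ALUOp1": 0, "ALUOp0": 0, "ALUSrc": 1},
        "M": {"Branch": 0, "MemRead": 0, "MemWrite": 1},
        "WB": {"RegWrite": 0, "MemtoReg": "X"},
    },
    "beq": {
        "EX": {"RegDst": "X", "ALUOp1": 0, "ALUOp0": 1, "ALUSrc": 0},
        "M": {"Branch": 1, "MemRead": 0, "MemWrite": 0},
        "WB": {"RegWrite": 0, "MemtoReg": "X"},
    },
}


def decode(kind):
    """ID stage: produce the full control bundle written into ID/EX."""
    return dict(CONTROL[kind])


def advance(bundle, used):
    """Drop the group consumed by the stage just completed; pass the rest on."""
    return {group: signals for group, signals in bundle.items() if group != used}


def walk(kind):
    """Return the control bundles held in ID/EX, EX/MEM, and MEM/WB."""
    id_ex = decode(kind)
    ex_mem = advance(id_ex, "EX")
    mem_wb = advance(ex_mem, "M")
    return id_ex, ex_mem, mem_wb


if __name__ == "__main__":
    for name, bundle in zip(("ID/EX", "EX/MEM", "MEM/WB"), walk("lw")):
        print(name, bundle)
```

No central controller or finite state machine is needed in this scheme: each instruction carries its own settings, and each pipeline register stores only the groups that later stages still need.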

13 Datapath with Control. [Figure: pipelined datapath with the control unit added; control values travel through the ID/EX, EX/MEM, and MEM/WB pipeline registers and drive PCSrc, Branch, ALUSrc, ALUOp, RegDst, MemtoReg, and the memory control lines.]

14 Dependencies. Problem with starting next instruction before first is finished: dependencies that go backward in time are data hazards. [Figure: pipeline diagram over CC 1 to CC 9 for the sequence sub, and, or, add, sw, in which sub writes register $2 and each of the four following instructions reads $2; a row above the diagram tracks the value of register $2 per clock cycle.]

15 Software Solution. Have compiler guarantee no hazards. Where do we insert the nops? [Code listing: the sub, and, or, add, sw sequence, in which sub writes $2 and the following instructions read it.] Problem: this really slows us down!
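One way to reason about where the nops go is to scan the program and pad whenever a consumer sits too close to its producer. The following is a minimal sketch under stated assumptions: no forwarding, a register file that can be written and read in the same cycle (so two instructions must separate a producer from its consumer), and illustrative register numbers for every register other than $2.

```python
NOP = ("nop", None, ())


def insert_nops(program, gap=2):
    """Pad `program` so at least `gap` slots separate a register write from
    any later read of that register.

    Each instruction is (text, dest_register_or_None, source_registers).
    """
    out = []
    for instr in program:
        sources = instr[2]
        needed = 0
        # Look back over the last `gap` emitted slots for a producer.
        for distance in range(1, gap + 1):
            if distance > len(out):
                break
            dest = out[-distance][1]
            if dest not in (None, 0) and dest in sources:
                needed = max(needed, gap - distance + 1)
        out.extend([NOP] * needed)
        out.append(instr)
    return out


# Register numbers other than 2 are hypothetical, chosen for illustration.
PROGRAM = [
    ("sub", 2, (1, 3)),
    ("and", 12, (2, 5)),
    ("or", 13, (6, 2)),
    ("add", 14, (2, 2)),
    ("sw", None, (15, 2)),
]

if __name__ == "__main__":
    for text, _, _ in insert_nops(PROGRAM):
        print(text)
```

Under these assumptions the sketch places two nops between the sub and the and; the later instructions are then far enough from the sub to read the updated $2 without further padding.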

16 Forwarding. Use temporary results, don't wait for them to be written: register file forwarding to handle read/write to same register; ALU forwarding. [Figure: pipeline diagram over CC 1 to CC 9 for the sub, and, or, add, sw sequence, with rows tracking the value of register $2 and of the EX/MEM and MEM/WB pipeline registers per clock cycle; an annotation on one $2 operand asks what would happen if it were a different register.]

17 Forwarding. [Figure: pipelined datapath with a forwarding unit; its inputs include the Rs and Rt register numbers (IF/ID.RegisterRs, IF/ID.RegisterRt) and the destination register numbers EX/MEM.RegisterRd and MEM/WB.RegisterRd.]
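The figure shows only the forwarding unit's inputs, so the selection logic below is an assumption: it follows the conditions usually given for this kind of five-stage pipeline (forward from the nearest earlier producer, and never forward register 0). The class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class PipeReg:
    """The fields of a pipeline register that the forwarding unit inspects."""
    reg_write: bool  # the instruction held here will write a register
    rd: int          # destination register number


def forward_select(src, ex_mem, mem_wb):
    """Choose where an ALU operand with source register `src` comes from."""
    # The most recent result wins: check EX/MEM before MEM/WB.
    if ex_mem.reg_write and ex_mem.rd != 0 and ex_mem.rd == src:
        return "EX/MEM"
    if mem_wb.reg_write and mem_wb.rd != 0 and mem_wb.rd == src:
        return "MEM/WB"
    return "REG"  # no hazard: use the value read from the register file


if __name__ == "__main__":
    # A producer of $2 is one instruction ahead of a consumer of $2.
    print(forward_select(2, PipeReg(True, 2), PipeReg(False, 0)))
    # The producer of $2 is two instructions ahead.
    print(forward_select(2, PipeReg(True, 12), PipeReg(True, 2)))
```

The same function would be evaluated once for each ALU operand (Rs and Rt), driving one multiplexer per operand.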

18 Can't always forward. Load word can still cause a hazard: an instruction tries to read a register following a load instruction that writes to the same register. [Figure: pipeline diagram over CC 1 to CC 9 for an lw that writes $2, followed by and $4, $2, $5; or $8, $2, $6; add $9, $4, $2; and an slt.] Thus, we need a hazard detection unit to stall the instruction that follows the load.

19 Stalling. We can stall the pipeline by keeping an instruction in the same stage. [Figure: the lw, and, or, add, slt sequence over CC 1 to CC 10, with a bubble inserted after the lw.]

20 Hazard Detection Unit. Stall by letting an instruction that won't write anything go forward. [Figure: pipelined datapath with a hazard detection unit and a forwarding unit; the hazard detection unit's inputs include ID/EX.MemRead, ID/EX.RegisterRt, IF/ID.RegisterRs, and IF/ID.RegisterRt.]
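Given the inputs named in the figure, the load-use check can be sketched in a few lines. This is an assumed formulation of the usual condition (stall when the instruction in EX is a load whose destination matches either source of the instruction in ID); the function and field names are illustrative.

```python
def load_use_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """True when the instruction in ID must wait one cycle for a load in EX."""
    return bool(id_ex_mem_read) and id_ex_rt in (if_id_rs, if_id_rt)


def stall_actions(stall):
    """What a stall does: hold the PC and IF/ID, and send a bubble into EX."""
    return {
        "pc_write": not stall,      # hold the PC so the same fetch repeats
        "if_id_write": not stall,   # hold IF/ID so the same decode repeats
        "zero_controls": stall,     # zeroed controls: nothing gets written
    }


if __name__ == "__main__":
    # A load into $2 is in EX while an instruction reading $2 is in ID.
    stall = load_use_stall(True, 2, 2, 5)
    print(stall, stall_actions(stall))
```

Zeroing the control lines is what the slide means by letting an instruction that won't write anything go forward: the bubble travels down the pipeline but changes no register or memory state.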

21 Branch Hazards. When we decide to branch, other instructions are in the pipeline! [Figure: pipeline diagram over CC 1 to CC 9 for a beq with branch offset 7, followed by the and, or, and add at addresses 44, 48, and 52, and then the lw at address 72.] We are predicting branch not taken: need to add hardware for flushing instructions if we are wrong.
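The addresses in the figure are consistent with the usual MIPS branch-target arithmetic, which the sketch below works through. It assumes the beq sits at address 40 (one word before the instruction at 44) and that the branch is resolved three cycles after it is fetched, matching the three sequential instructions shown; the function names are illustrative.

```python
WORD = 4  # bytes per instruction


def branch_target(pc, offset):
    """Target address: the instruction after the branch plus the word offset."""
    return pc + WORD + (offset << 2)


def fetched_before_resolution(pc, cycles_until_resolved=3):
    """Addresses fetched under predict-not-taken before the branch resolves."""
    return [pc + WORD * i for i in range(1, cycles_until_resolved + 1)]


if __name__ == "__main__":
    beq_pc = 40  # assumed address of the beq
    print("target:", branch_target(beq_pc, 7))
    print("to flush if taken:", fetched_before_resolution(beq_pc))
```

If the branch is taken, the instructions returned by `fetched_before_resolution` are the ones the flushing hardware has to discard; resolving the branch earlier in the pipeline shortens that list.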

22 Flushing Instructions. [Figure: pipelined datapath with an IF.Flush control line, a hazard detection unit, a forwarding unit, and an equality comparator (=) at the register file outputs.]

23 Improving Performance. Try and avoid stalls! E.g., reorder these instructions: [Code listing: two lw instructions followed by two sw instructions, where the first sw stores $t2, the register loaded by the second lw.]
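The benefit of reordering can be checked by counting load-use stalls before and after. The sketch below assumes a pipeline with forwarding in which an instruction that immediately follows a lw and reads the loaded register costs one stall; the register names $t0 and $t1 are hypothetical stand-ins, and only $t2 is taken from the slide.

```python
def load_use_stalls(program):
    """Count instructions that read a register loaded by the lw just before.

    Each instruction is (opcode, dest_register_or_None, source_registers).
    """
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        opcode, dest, _ = prev
        if opcode == "lw" and dest in cur[2]:
            stalls += 1
    return stalls


# Hypothetical registers: t1 is the base address, t0 and t2 hold loaded values.
ORIGINAL = [
    ("lw", "t0", ("t1",)),       # load first word
    ("lw", "t2", ("t1",)),       # load second word
    ("sw", None, ("t2", "t1")),  # store t2 right after loading it
    ("sw", None, ("t0", "t1")),
]

# Swapping the two stores separates the second lw from the use of t2.
REORDERED = [ORIGINAL[0], ORIGINAL[1], ORIGINAL[3], ORIGINAL[2]]

if __name__ == "__main__":
    print("original stalls: ", load_use_stalls(ORIGINAL))
    print("reordered stalls:", load_use_stalls(REORDERED))
```

The swap is safe only because the two stores write different locations and neither changes a register the other reads; a compiler has to establish that before reordering.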

24 Improving Performance. Add a branch delay slot: the next instruction after a branch is always executed; rely on compiler to fill the slot with something useful. Superscalar: start more than one instruction in the same cycle.

25 Dynamic Scheduling. The hardware performs the scheduling: hardware tries to find instructions to execute; out-of-order execution is possible; speculative execution and dynamic branch prediction.

26 Dynamic Scheduling. All modern processors are very complicated. DEC Alpha 21264: 9 stage pipeline, 6 instruction issue. PowerPC and Pentium: branch history table. Compiler technology important.
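A branch history table can be illustrated with a toy model. The sketch below is not a description of any of the processors named above: it assumes a small table of 2-bit saturating counters indexed by the low bits of the branch address, with a table size and starting state chosen only for illustration.

```python
class BranchHistoryTable:
    """Toy predictor: one 2-bit saturating counter per table entry."""

    def __init__(self, entries=16):
        self.entries = entries
        # Counter values 0 and 1 predict not taken; 2 and 3 predict taken.
        self.counters = [1] * entries

    def _index(self, pc):
        return (pc >> 2) % self.entries  # drop the byte offset within a word

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def count_mispredictions(outcomes, pc=40):
    """Run one branch at address `pc` through a fresh table."""
    table = BranchHistoryTable()
    misses = 0
    for taken in outcomes:
        if table.predict(pc) != taken:
            misses += 1
        table.update(pc, taken)
    return misses


if __name__ == "__main__":
    # A loop branch: taken nine times, then falls through once.
    print(count_mispredictions([True] * 9 + [False]))
```

For a loop branch the predictor in this sketch misses once while warming up and once at loop exit, which is the kind of behavior that makes dynamic prediction attractive compared with always predicting not taken.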

27 Dynamic Scheduling. This class has given you the background you need to learn more. Video: An Overview of Intel's Pentium Processor.

Chapter 4 (Part I) The Processor. Baback Izadi Division of Engineering Programs

Chapter 4 (Part I) The Processor. Baback Izadi Division of Engineering Programs EGC442 Introdction to Compter Architectre Chapter 4 (Part I) The Processor Baback Izadi Division of Engineering Programs bai@engr.newpaltz.ed Introdction CPU performance factors Instrction cont Determined

More information

Review: What is it? What does it do? slti $4, $5, 6

Review: What is it? What does it do? slti $4, $5, 6 Review: What is it? What does it do? Reg Src Instrction Instrction [3-] I [25-2] I [2-6] I [5 - ] 2 Src Op Reslt em em emtoreg I [5 - ] etend slti $, $5, 6 Reg Src Instrction Instrction [3-] I [25-2] I

More information

CpE 442. Designing a Pipeline Processor (lect. II)

CpE 442. Designing a Pipeline Processor (lect. II) CpE 442 Designing a Pipeline Pocesso (lect. II) CPE 442 hazads.1 Otline of Today s Lecte Recap and Intodction (5 mintes) Intodction to Hazads (15 mintes) Fowading (25 mintes) 1 cycle Load Delay (5 mintes)

More information

Pipeline design. Mehran Rezaei

Pipeline design. Mehran Rezaei Pipeline design Mehran Rezaei Shift Left 2 pc Opcode ExtOp Cont Unit RegDst Addr Addr2 Addr npcsle Reg ALUSrc Mem 2 OVF Branch ALUCtr MemtoReg Mem Funct Extension ALUOp ALU Cont Shift Left 2 ID EXE MEM

More information

PIPELINING: BRANCH AND MULTICYCLE INSTRUCTIONS

PIPELINING: BRANCH AND MULTICYCLE INSTRUCTIONS PIPELINING: BRANCH AND MULTICYCLE INSTRUCTIONS Mahdi Nazm Bojnordi Assistant Professor School of Computing University of Utah CS/ECE 6810: Computer Architecture Overview Announcement Homework 1 submission

More information

06 1 MIPS Implementation Pipelined DLX and MIPS Implementations: Hardware, notation, hazards.

06 1 MIPS Implementation Pipelined DLX and MIPS Implementations: Hardware, notation, hazards. 06 1 MIPS Implementation 06 1 Material from Chapter 3 of H&P (for DLX). Material from Chapter 6 of P&H (for MIPS). line: (In this set.) Unpipelined DLX Implementation. (Diagram only.) Pipelined DLX and

More information

EECS150 - Digital Design Lecture 9 - CPU Microarchitecture. CMOS Devices

EECS150 - Digital Design Lecture 9 - CPU Microarchitecture. CMOS Devices EECS150 - Digital Design Lecture 9 - CPU Microarchitecture Feb 17, 2009 John Wawrzynek Spring 2009 EECS150 - Lec9-cpu Page 1 CMOS Devices Review: Transistor switch-level models The gate acts like a capacitor.

More information

Instruction Level Parallelism

Instruction Level Parallelism Instruction Level Parallelism Pipelining, Hazards Appendix C, HPe Outline Pipelining, Hazards Branch prediction Static and Dynamic Scheduling Speculation Compiler techniques, VLIW Limits of ILP. Pipelining

More information

Slide Set 6. for ENCM 369 Winter 2018 Section 01. Steve Norman, PhD, PEng

Slide Set 6. for ENCM 369 Winter 2018 Section 01. Steve Norman, PhD, PEng Slide Set 6 for ENCM 369 Winter 2018 Section 01 Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary February 2018 ENCM 369 Winter 2018 Section

More information

Outline. 1 Reiteration. 2 Dynamic scheduling - Tomasulo. 3 Superscalar, VLIW. 4 Speculation. 5 ILP limitations. 6 What we have done so far.

Outline. 1 Reiteration. 2 Dynamic scheduling - Tomasulo. 3 Superscalar, VLIW. 4 Speculation. 5 ILP limitations. 6 What we have done so far. Outline 1 Reiteration Lecture 5: EIT090 Computer Architecture 2 Dynamic scheduling - Tomasulo Anders Ardö 3 Superscalar, VLIW EIT Electrical and Information Technology, Lund University Sept. 30, 2009 4

More information

Contents Slide Set 6. Introduction to Chapter 7 of the textbook. Outline of Slide Set 6. An outline of the first part of Chapter 7

Contents Slide Set 6. Introduction to Chapter 7 of the textbook. Outline of Slide Set 6. An outline of the first part of Chapter 7 CM 69 W4 Section Slide Set 6 slide 2/9 Contents Slide Set 6 for CM 69 Winter 24 Lecture Section Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary

More information

Slide Set 8. for ENCM 501 in Winter Term, Steve Norman, PhD, PEng

Slide Set 8. for ENCM 501 in Winter Term, Steve Norman, PhD, PEng Slide Set 8 for ENCM 501 in Winter Term, 2017 Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary Winter Term, 2017 ENCM 501 W17 Lectures: Slide

More information

CS 152 Midterm 2 May 2, 2002 Bob Brodersen

CS 152 Midterm 2 May 2, 2002 Bob Brodersen CS 152 Midterm 2 May 2, 2002 Bob Brodersen Name Solutions Show your work if you want partial credit! Try all the problems, don t get stuck on one of them. Each one is worth 10 points. 1) 2) 3) 4) 5) 6)

More information

Digital Design and Computer Architecture

Digital Design and Computer Architecture Digital Design and Computer Architecture Lab 0: Multicycle Processor (Part ) Introduction In this lab and the next, you will design and build your own multicycle MIPS processor. You will be much more on

More information

Slide Set 9. for ENCM 501 in Winter Steve Norman, PhD, PEng

Slide Set 9. for ENCM 501 in Winter Steve Norman, PhD, PEng Slide Set 9 for ENCM 501 in Winter 2018 Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary March 2018 ENCM 501 Winter 2018 Slide Set 9 slide

More information

ASIC = Application specific integrated circuit

ASIC = Application specific integrated circuit ASIC = Application specific integrated circuit CS 2630 Computer Organization Meeting 19: Building a MIPS processor Brandon Myers University of Iowa The goal: implement most of MIPS So far Implementing

More information

Instruction Level Parallelism and Its. (Part II) ECE 154B

Instruction Level Parallelism and Its. (Part II) ECE 154B Instruction Level Parallelism and Its Exploitation (Part II) ECE 154B Dmitri Strukov ILP techniques not covered last week this week next week Scoreboard Technique Review Allow for out of order execution

More information

Fundamentals of Computer Systems

Fundamentals of Computer Systems Fundamentals of Computer Systems A Pipelined MIPS Processor Stephen A. Edwards Columbia University Summer 25 Technical Illustrations Copyright c 27 Elsevier Sequential Laundry Time Alice Bob Cindy Pipelined

More information

A few questions to test your familiarity of Lab7 at the end of finishing all assigned parts of Lab 7

A few questions to test your familiarity of Lab7 at the end of finishing all assigned parts of Lab 7 EE457 Lab7 Questions page A few questions to test your familiarity of Lab7 at the end of finishing all assigned parts of Lab 7 1. A. In which parts or subparts of Lab 7 does the STALL signal cause the

More information

Advanced Pipelining and Instruction-Level Paralelism (2)

Advanced Pipelining and Instruction-Level Paralelism (2) Advanced Pipelining and Instruction-Level Paralelism (2) Riferimenti bibliografici Computer architecture, a quantitative approach, Hennessy & Patterson: (Morgan Kaufmann eds.) Tomasulo s Algorithm For

More information

Instruction Level Parallelism Part III

Instruction Level Parallelism Part III Course on: Advanced Computer Architectures Instruction Level Parallelism Part III Prof. Cristina Silvano Politecnico di Milano email: cristina.silvano@polimi.it 1 Outline of Part III Dynamic Scheduling

More information

Fill-in the following to understand stalling needs and forwarding opportunities

Fill-in the following to understand stalling needs and forwarding opportunities Fill-in the following to understand stalling needs and forwarding opportunities Instruction ADD4 ADD Receiving forwarding help Providing forwarding help Insists on Doesn t mind Doesn t mind Capable of

More information

Instruction Level Parallelism Part III

Instruction Level Parallelism Part III Course on: Advanced Computer Architectures Instruction Level Parallelism Part III Prof. Cristina Silvano Politecnico di Milano email: cristina.silvano@polimi.it 1 Outline of Part III Tomasulo Dynamic Scheduling

More information

Out-of-Order Execution

Out-of-Order Execution 1 Out-of-Order Execution Several implementations out-of-order completion CDC 6600 with scoreboarding IBM 360/91 with Tomasulo s algorithm & reservation stations out-of-order completion leads to: imprecise

More information

Chapter 3 Instruction-Level Parallelism and its Exploitation (Part 1)

Chapter 3 Instruction-Level Parallelism and its Exploitation (Part 1) Chapter 3 Instruction-Level Parallelism and its Exploitation (Part 1) ILP vs. Parallel Computers Dynamic Scheduling (Section 3.4, 3.5) Dynamic Branch Prediction (Section 3.3) Hardware Speculation and Precise

More information

Lecture 16: Instruction Level Parallelism -- Dynamic Scheduling (OOO) via Tomasulo s Approach

Lecture 16: Instruction Level Parallelism -- Dynamic Scheduling (OOO) via Tomasulo s Approach Lecture 16: Instruction Level Parallelism -- Dynamic Scheduling (OOO) via Tomasulo s Approach CSE 564 Computer Architecture Summer 2017 Department of Computer Science and Engineering Yonghong Yan yan@oakland.edu

More information

On the Rules of Low-Power Design

On the Rules of Low-Power Design On the Rules of Low-Power Design (and How to Break Them) Prof. Todd Austin Advanced Computer Architecture Lab University of Michigan austin@umich.edu Once upon a time 1 Rules of Low-Power Design P = acv

More information

Tomasulo Algorithm. Developed at IBM and first implemented in IBM s 360/91

Tomasulo Algorithm. Developed at IBM and first implemented in IBM s 360/91 Tomasulo Algorithm Developed at IBM and first implemented in IBM s 360/91 IBM wanted to use the existing compiler instead of a specialized compiler for high end machines. Tracks when operands are available

More information

Very Short Answer: (1) (1) Peak performance does or does not track observed performance.

Very Short Answer: (1) (1) Peak performance does or does not track observed performance. Very Short Answer: (1) (1) Peak performance does or does not track observed performance. (2) (1) Which is more effective, dynamic or static branch prediction? (3) (1) Do benchmarks remain valid indefinitely?

More information

Modeling Digital Systems with Verilog

Modeling Digital Systems with Verilog Modeling Digital Systems with Verilog Prof. Chien-Nan Liu TEL: 03-4227151 ext:34534 Email: jimmy@ee.ncu.edu.tw 6-1 Composition of Digital Systems Most digital systems can be partitioned into two types

More information

Computer Architecture Spring 2016

Computer Architecture Spring 2016 Computer Architecture Spring 2016 Lecture 12: Dynamic Scheduling: Tomasulo s Algorithm Shuai Wang Department of Computer Science and Technology Nanjing University [Slides adapted from CS252, UC Berkeley

More information

CPE300: Digital System Architecture and Design

CPE300: Digital System Architecture and Design CPE300: Digital System Architecture and Design Fall 2011 MW 17:30-18:45 CBC C316 1-Bus Architecture and Datapath 10262011 http://www.egr.unlv.edu/~b1morris/cpe300/ 2 Outline 1-Bus Microarchitecture and

More information

DYNAMIC INSTRUCTION SCHEDULING WITH TOMASULO

DYNAMIC INSTRUCTION SCHEDULING WITH TOMASULO DYNAMIC INSTRUCTION SCHEDULING WITH TOMASULO Slides by: Pedro Tomás Additional reading: Computer Architecture: A Quantitative Approach, 5th edition, Chapter 3, John L. Hennessy and David A. Patterson,

More information

EEC 581 Computer Architecture. Instruction Level Parallelism (3.4 & 3.5 Dynamic Scheduling)

EEC 581 Computer Architecture. Instruction Level Parallelism (3.4 & 3.5 Dynamic Scheduling) 1 EEC 581 Computer Architecture Instruction Level Parallelism (3.4 & 3.5 Dynamic Scheduling) Chansu Yu Electrical and Computer Engineering Cleveland State University Overview of Chap. 3 (again) Pipelined

More information

Computer and Digital System Architecture

Computer and Digital System Architecture Compter and Digital Sytem Architectre EE/CpE-517-A Brce McNair mcnair@teven.ed Steven Intitte of Technology - All right reerved 4-1/65 Week 4 ARM organization and implementation Frer Ch. 4 Steven Intitte

More information

Read-only memory (ROM) Digital logic: ALUs Sequential logic circuits. Don't cares. Bus

Read-only memory (ROM) Digital logic: ALUs Sequential logic circuits. Don't cares. Bus Digital logic: ALUs Sequential logic circuits CS207, Fall 2004 October 11, 13, and 15, 2004 1 Read-only memory (ROM) A form of memory Contents fixed when circuit is created n input lines for 2 n addressable

More information

Performance Evolution of 16 Bit Processor in FPGA using State Encoding Techniques

Performance Evolution of 16 Bit Processor in FPGA using State Encoding Techniques Performance Evolution of 16 Bit Processor in FPGA using State Encoding Techniques Madhavi Anupoju 1, M. Sunil Prakash 2 1 M.Tech (VLSI) Student, Department of Electronics & Communication Engineering, MVGR

More information

Analog Signal Input. ! Note: B.1 Analog Connections. Programming for Analog Channels

Analog Signal Input. ! Note: B.1 Analog Connections. Programming for Analog Channels B Analog Signal Inpt B.1 Analog Connections Refer to the diagram (page B-10) showing the VAN analog boards for connection of analog inpts. Be sre yo follow the indicated positive and negative polarity

More information

CS 110 Computer Architecture. Finite State Machines, Functional Units. Instructor: Sören Schwertfeger.

CS 110 Computer Architecture. Finite State Machines, Functional Units. Instructor: Sören Schwertfeger. CS 110 Computer Architecture Finite State Machines, Functional Units Instructor: Sören Schwertfeger http://shtech.org/courses/ca/ School of Information Science and Technology SIST ShanghaiTech University

More information

Dynamic Scheduling. Differences between Tomasulo. Tomasulo Algorithm. CDC 6600 scoreboard. Or ydanicm ceshuldngi

Dynamic Scheduling. Differences between Tomasulo. Tomasulo Algorithm. CDC 6600 scoreboard. Or ydanicm ceshuldngi Dynamic Scheduling (or out-of-order execution) Dynamic Scheduling Or ydanicm ceshuldngi CDC 6600 scoreboard Instruction storage added to each functional execution unit Instructions issue to FU when no

More information

CS3350B Computer Architecture Winter 2015

CS3350B Computer Architecture Winter 2015 CS3350B Computer Architecture Winter 2015 Lecture 5.2: State Circuits: Circuits that Remember Marc Moreno Maza www.csd.uwo.ca/courses/cs3350b [Adapted from lectures on Computer Organization and Design,

More information

Differences between Tomasulo. Another Dynamic Algorithm: Tomasulo Organization. Reservation Station Components

Differences between Tomasulo. Another Dynamic Algorithm: Tomasulo Organization. Reservation Station Components Another Dynamic Algorithm: Tomasulo Algorithm Differences between Tomasulo Algorithm & Scoreboard For IBM 360/9 about 3 years after CDC 6600 Goal: High Performance without special compilers Differences

More information

CS152 Computer Architecture and Engineering Lecture 17 Advanced Pipelining: Tomasulo Algorithm

CS152 Computer Architecture and Engineering Lecture 17 Advanced Pipelining: Tomasulo Algorithm CS152 Computer Architecture and Engineering Lecture 17 Advanced Pipelining: Tomasulo Algorithm 2003-10-23 Dave Patterson (www.cs.berkeley.edu/~patterson) www-inst.eecs.berkeley.edu/~cs152/ CS 152 L17 Adv.

More information

First Name Last Name November 10, 2009 CS-343 Exam 2

First Name Last Name November 10, 2009 CS-343 Exam 2 CS-343 Exam 2 Instructions: For multiple choice questions, circle the letter of the one best choice unless the question explicitly states that it might have multiple correct answers. There is no penalty

More information

A VLIW Processor for Multimedia Applications

A VLIW Processor for Multimedia Applications A VLIW Processor for Multimedia Applications E. Holmann T. Yoshida A. Yamada Y. Shimazu Mitsubishi Electric Corporation, System LSI Laboratory 4-1 Mizuhara, Itami, Hyogo 664, Japan Outline Objective System

More information

Multiplexor (aka MUX) An example, yet VERY useful circuit!

Multiplexor (aka MUX) An example, yet VERY useful circuit! Multiplexor (aka MUX) An example, yet VERY useful circuit! A B 0 1 Y S A B Y 0 0 x 0 0 1 x 1 1 x 0 0 1 x 1 1 S=1 S=0 Y = (S)? B:A; Y=S A+SB when S = 0: output A 1: output B 56 A 32-bit MUX Use 32 1-bit

More information

MINIMED 640G SYSTEM^ Getting Started. WITH THE MiniMed 640G INSULIN PUMP

MINIMED 640G SYSTEM^ Getting Started. WITH THE MiniMed 640G INSULIN PUMP MINIMED 640G SYSTEM^ Getting Started WITH THE MiniMed 640G INSULIN PUMP let s get started! Table of Contents Section 1: Getting Started... 3 Getting Started with the MiniMed 640G Inslin Pmp...3 1.1 Pmp

More information

AN ABSTRACT OF THE THESIS OF

AN ABSTRACT OF THE THESIS OF AN ABSTRACT OF THE THESIS OF Licheng Zhang for the degree of Master of Science in Electrical and Computer Engineering presented on June 7, 1989. Title: The Design of A Reduced Instruction Set Computer

More information

Go BEARS~ What are Machine Structures? Lecture #15 Intro to Synchronous Digital Systems, State Elements I C

Go BEARS~ What are Machine Structures? Lecture #15 Intro to Synchronous Digital Systems, State Elements I C CS6C L5 Intro to SDS, State Elements I () inst.eecs.berkeley.edu/~cs6c CS6C : Machine Structures Lecture #5 Intro to Synchronous Digital Systems, State Elements I 28-7-6 Go BEARS~ Albert Chae, Instructor

More information

Ryerson University Department of Electrical and Computer Engineering COE/BME 328 Digital Systems

Ryerson University Department of Electrical and Computer Engineering COE/BME 328 Digital Systems 1 P a g e Ryerson University Department of Electrical and Computer Engineering COE/BME 328 Digital Systems Lab 6 35 Marks (3 weeks) Design of a Simple General-Purpose Processor Due Date: Week 12 Objective:

More information

Cast Away on the Letter A

Cast Away on the Letter A Cast Away on the Letter A TEACHER S GUIDE ELA COMMON CORE STANDARDS 4TH GRADE: For 4th Grade: Key Ideas and Details CCSS.ELA-LITERACY.RL.4.2 Determine a theme of a story, drama, or poem from details in

More information

Chapter 05: Basic Processing Units Control Unit Design Organization. Lesson 11: Multiple Bus Organisation

Chapter 05: Basic Processing Units Control Unit Design Organization. Lesson 11: Multiple Bus Organisation Chapter 05: Basic Processing Units Control Unit Design Organization Lesson 11: Multiple Bus Organisation Objective Understand multiple bus organisation Learn how the number of independent steps can be

More information

Advanced Devices. Registers Counters Multiplexers Decoders Adders. CSC258 Lecture Slides Steve Engels, 2006 Slide 1 of 20

Advanced Devices. Registers Counters Multiplexers Decoders Adders. CSC258 Lecture Slides Steve Engels, 2006 Slide 1 of 20 Advanced Devices Using a combination of gates and flip-flops, we can construct more sophisticated logical devices. These devices, while more complex, are still considered fundamental to basic logic design.

More information

With Ease. BETTY WAGNER Associate Trinity College London, Associate Music Australia READING LEDGER LINE NOTES

With Ease. BETTY WAGNER Associate Trinity College London, Associate Music Australia READING LEDGER LINE NOTES READING LEDGER LINE NOTES With Ease f G f o o BETTY WAGNER Associate Trinity College London, Associate Msic Astralia READING LEDGER LINE NOTES A Nova WITH EASE Book Company Page Pblication http://www.msic-with-ease.com

More information

CS61C : Machine Structures

CS61C : Machine Structures inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture 24 State Circuits : Circuits that Remember Senior Lecturer SOE Dan Garcia www.cs.berkeley.edu/~ddgarcia Bio NAND gate Researchers at Imperial

More information

Register Transfer Level (RTL) Design Cont.

Register Transfer Level (RTL) Design Cont. CSE4: Components and Design Techniques for Digital Systems Register Transfer Level (RTL) Design Cont. Tajana Simunic Rosing Where we are now What we are covering today: RTL design examples, RTL critical

More information

4.5 Pipelining. Pipelining is Natural!

4.5 Pipelining. Pipelining is Natural! 4.5 Pipelining Ovelapped execution of instuctions Instuction level paallelism (concuency) Example pipeline: assembly line ( T Fod) Response time fo any instuction is the same Instuction thoughput inceases

More information

Introduction to CMOS VLSI Design (E158) Lab 3: Datapath and Zipper Assembly

Introduction to CMOS VLSI Design (E158) Lab 3: Datapath and Zipper Assembly Harris Introduction to CMOS VLSI Design (E158) Lab 3: Datapath and Zipper Assembly An n-bit datapath consists of n identical horizontal bitslices 1. Data signals travel horizontally along the bitslice.

More information

Computer Architecture Basic Computer Organization and Design

Computer Architecture Basic Computer Organization and Design After the fetch and decode phase, PC contains 31, which is the address of the next instruction in the program (the return address). The register AR holds the effective address 170 [see figure 6.10(a)].

More information

Enhancing Performance in Multiple Execution Unit Architecture using Tomasulo Algorithm

Enhancing Performance in Multiple Execution Unit Architecture using Tomasulo Algorithm Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology ISSN 2320 088X IMPACT FACTOR: 6.017 IJCSMC,

More information

A Buyers Guide to Laser Projection

A Buyers Guide to Laser Projection The Eropean Digital Cinema Form A Byers Gide to Laser Projection AUTUMN 2018 Table of Contents Slides 2-5 Introdctory notes Slides 6-22 1: Technical Considerations Slides 23-31 2. Financial and lifetime

More information

Digital Design and Computer Architecture

Digital Design and Computer Architecture Digital Design and Computer Architecture Lab 0: Multicycle ARM Processor (Part ) Introduction In this lab and the next, you will design and build your own multicycle ARM processor. You will be much more

More information

Scoreboard Limitations!

Scoreboard Limitations! Scoreboard Limitations! No forwarding read from register! Structural hazards stall at issue! WAW hazard stall at issue!! WAR hazard stall at write! Inf3 Computer Architecture - 2015-2016 1 Dynamic Scheduling

More information

Sequential Elements con t Synchronous Digital Systems

Sequential Elements con t Synchronous Digital Systems ecture 15 Computer Science 61C Spring 2017 February 22th, 2017 Sequential Elements con t Synchronous Digital Systems 1 Administrivia I Good news: Waitlist students: You are in! Concurrent Enrollment students:

More information

Low Power VLSI Circuits and Systems Prof. Ajit Pal Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

Low Power VLSI Circuits and Systems Prof. Ajit Pal Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Low Power VLSI Circuits and Systems Prof. Ajit Pal Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No. # 29 Minimizing Switched Capacitance-III. (Refer

More Digital Circuits

More Digital Circuits More Digital Circuits 1 Signals and Waveforms: Showing Time & Grouping 2 Signals and Waveforms: Circuit Delay 2 3 4 5 3 10 0 1 5 13 4 6 3 Sample Debugging Waveform 4 Type of Circuits Synchronous Digital

Last time, we saw how latches can be used as memory in a circuit

Last time, we saw how latches can be used as memory in a circuit Flip-Flops Last time, we saw how latches can be used as memory in a circuit Latches introduce new problems: We need to know when to enable a latch We also need to quickly disable a latch In other words,

11. Sequential Elements

11. Sequential Elements 11. Sequential Elements Jacob Abraham Department of Electrical and Computer Engineering The University of Texas at Austin VLSI Design Fall 2017 October 11, 2017 ECE Department, University of Texas at Austin

Review C program: foo.c Compiler Assembly program: foo.s Assembler Object(mach lang module): foo.o. Lecture #14

Review C program: foo.c Compiler Assembly program: foo.s Assembler Object(mach lang module): foo.o. Lecture #14 CS61C L14 Introduction to Synchronous Digital Systems (1) inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture #14 Introduction to Synchronous Digital Systems 2007-7-18 Scott Beamer, Instructor

CS61C : Machine Structures

CS61C : Machine Structures inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture #14 Introduction to Synchronous Digital Systems 2007-7-18 Scott Beamer, Instructor CS61C L14 Introduction to Synchronous Digital Systems

CS61C : Machine Structures

CS61C : Machine Structures inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture #21 State Elements: Circuits that Remember 2008-3-14 Scott Beamer, Guest Lecturer www.piday.org 3.14159265358979323 8462643383279502884

Features 1 Harris and other corners

Features 1 Harris and other corners CS 4495 Computer Vision A. Bobick Features 1: Harris CS 4495 Computer Vision Features 1 Harris Aaron Bobick School of Interactive Computing CS 4495 Computer Vision A. Bobick Features 1: Harris Administrivia PS 4:

Scoreboard Limitations

Scoreboard Limitations Scoreboard Limitations! No forwarding read from register! Structural hazards stall at issue! WAW hazard stall at issue! WAR hazard stall at write Inf3 Computer Architecture - 2016-2017 1 Dynamic Scheduling

Music Theory Level 2. Name. Period

Music Theory Level 2. Name. Period Music Theory Level 2 Name Period Table of Contents Ledger Lines Grand Staff Page 3 Page 4 Ledger Line and Grand Staff Review Page 5 Grand Staff Piano Visual Page 6 Time Signatures Page 79 Theory Review Page Dotted

6.3 Sequential Circuits (plus a few Combinational)

6.3 Sequential Circuits (plus a few Combinational) 6.3 Sequential Circuits (plus a few Combinational) Logic Gates: Fundamental Building Blocks Introduction to Computer Science Robert Sedgewick and Kevin Wayne Copyright 2005 http://www.cs.princeton.edu/introcs

CHAPTER 4: Logic Circuits

CHAPTER 4: Logic Circuits CHAPTER 4: Logic Circuits II. Sequential Circuits Combinational circuits o The outputs depend only on the current input values o It uses only logic gates, decoders, multiplexers, ALUs Sequential circuits

CMOS VLSI Design. Lab 3: Datapath and Zipper Assembly

CMOS VLSI Design. Lab 3: Datapath and Zipper Assembly Harris CMOS VLSI Design Lab 3: Datapath and Zipper Assembly An n-bit datapath consists of n identical horizontal bitslices 1. Data signals travel horizontally along the bitslice. Control signals run vertically

Sequential Logic. Introduction to Computer Yung-Yu Chuang

Sequential Logic. Introduction to Computer Yung-Yu Chuang Sequential Logic Introduction to Computer Yung-Yu Chuang with slides by Sedgewick & Wayne (introcs.cs.princeton.edu), Nisan & Schocken (www.nand2tetris.org) and Harris & Harris (DDCA) Review of Combinational

Structural Fault Tolerance for SOC

Structural Fault Tolerance for SOC Structural Fault Tolerance for SOC Soft Error Fault Tolerant Systems Hrushikesh Chavan Department of ECE, University of Wisconsin Madison, USA hchavan@wisc.edu Younggyun Cho Department of ECE, University

Registers. Unit 12 Registers and Counters. Registers (D Flip-Flop based) Register Transfers (example not out of text) Accumulator Registers

Registers. Unit 12 Registers and Counters. Registers (D Flip-Flop based) Register Transfers (example not out of text) Accumulator Registers Unit 12 Registers and Counters Fundamentals of Logic Design EE2369 Prof. Eric MacDonald Fall Semester 23 Registers Groups of flip-flops Can contain data format can be unsigned, 2's complement and other more

Sequential Logic Design CS 64: Computer Organization and Design Logic Lecture #14

Sequential Logic Design CS 64: Computer Organization and Design Logic Lecture #14 Sequential Logic Design CS 64: Computer Organization and Design Logic Lecture #14 Ziad Matni Dept. of Computer Science, UCSB Administrative Only 2.5 weeks left!!!!!!!! OMG!!!!! Th. 5/24 Sequential Logic

Design and Implementation of Timer, GPIO, and 7-segment Peripherals

Design and Implementation of Timer, GPIO, and 7-segment Peripherals Design and Implementation of Timer, GPIO, and 7-segment Peripherals 1 Module Overview Learn about timers, GPIO and 7-segment display; Design and implement an AHB timer, a GPIO peripheral, and a 7-segment

In 2007, Pew Research conducted a survey to assess Americans' knowledge of

In 2007, Pew Research conducted a survey to assess Americans' knowledge of CHAPTER 12 Sample Surveys In 2007, Pew Research conducted a survey to assess Americans' knowledge of current events. They asked a random sample of 1,502 U.S. adults 23 factual questions about topics currently in

ECSE-323 Digital System Design. Datapath/Controller Lecture #1

ECSE-323 Digital System Design. Datapath/Controller Lecture #1 1 ECSE-323 Digital System Design Datapath/Controller Lecture #1 2 Synchronous Digital Systems are often designed in a modular hierarchical fashion. The system consists of modular subsystems, each of which

ECE337 Lab 4 Introduction to State Machines in VHDL

ECE337 Lab 4 Introduction to State Machines in VHDL ECE337 Lab 4 Introduction to State Machines in VHDL In this lab you will: Design, code, and test the functionality of the source version of a Moore model state machine of a sliding window average filter.

Objectives. Combinational logics Sequential logics Finite state machine Arithmetic circuits Datapath

Objectives. Combinational logics Sequential logics Finite state machine Arithmetic circuits Datapath Objectives Combinational logics Sequential logics Finite state machine Arithmetic circuits Datapath In the previous chapters we have studied how to develop a specification from a given application, and

A Low-cost, Radiation-Hardened Method for Pipeline Protection in Microprocessors

A Low-cost, Radiation-Hardened Method for Pipeline Protection in Microprocessors 1 A Low-cost, Radiation-Hardened Method for Pipeline Protection in Microprocessors Yang Lin, Mark Zwolinski, Senior Member, IEEE, and Basel Halak Abstract The aggressive scaling of semiconductor technology

Microprocessor Design

Microprocessor Design Microprocessor Design Principles and Practices With VHDL Enoch O. Hwang Brooks / Cole 2004 To my wife and children Windy, Jonathan and Michelle Contents 1. Designing a Microprocessor... 2 1.1 Overview

Montgomery Modular Exponentiation on Reconfigurable Hardware

Montgomery Modular Exponentiation on Reconfigurable Hardware Montgomery Modular Exponentiation on Reconfigurable Hardware Thomas Blum Worcester Polytechnic Institute ECE Department Worcester, MA 0609-2280, USA tblum@ece.wpi.edu Christof Paar christof@ece.wpi.edu Abstract

CSE140L: Components and Design Techniques for Digital Systems Lab. CPU design and PLDs. Tajana Simunic Rosing. Source: Vahid, Katz

CSE140L: Components and Design Techniques for Digital Systems Lab. CPU design and PLDs. Tajana Simunic Rosing. Source: Vahid, Katz CSE140L: Components and Design Techniques for Digital Systems Lab CPU design and PLDs Tajana Simunic Rosing Source: Vahid, Katz 1 Lab #3 due Lab #4 CPU design Today: CPU design - lab overview PLDs Updates

Lab 2: Hardware/Software Co-design with the Wimp51

Lab 2: Hardware/Software Co-design with the Wimp51 Lab 2: Hardware/Software Co-design with the Wimp51 CpE 214: Digital Engineering Lab II Last revised: February 26, 2013 (CAC) Hardware software co-design, now standard in industry, is an approach that brings

An Overview of FLEET CS-152

An Overview of FLEET CS-152 An Overview of FLEET CS-152 FLEET Brainchild of Ivan Sutherland Fleshed out in collaboration with Berkeley graduate students A one-instruction, clockless processor Alternatively: an asynchronous transport-triggered

DC Ultra. Concurrent Timing, Area, Power and Test Optimization. Overview

DC Ultra. Concurrent Timing, Area, Power and Test Optimization. Overview DATASHEET DC Ultra Concurrent Timing, Area, Power and Test Optimization DC Ultra RTL synthesis solution enables users to meet today s design challenges with concurrent optimization of timing, area, power

1. Basic safety information 4 2. Proper use 4

1. Basic safety information 4 2. Proper use 4 307041 01 EN Digital twilight switch LUNA 120 top2 1200100/ 1200200 1. Basic safety information 4 2. Proper use 4 Disposal 4 3. Installation and connection 5 Mounting the time switch 5 Connecting the cable

EXHIBITOR'S PROSPECTUS

EXHIBITOR'S PROSPECTUS EXHIBITOR'S PROSPECTUS Annual Conference & Trade Show TORCH Annual Conference & Trade Show April 18-20, 2017 Hyatt Regency Dallas DEADLINE FOR APPLICATION March 27, 2017 President's Message It is my pleasure

UC Berkeley CS61C : Machine Structures

UC Berkeley CS61C : Machine Structures inst.eecs.berkeley.edu/~cs61c UC Berkeley CS61C : Machine Structures Lecture 21 State Elements : Circuits that Remember 2007-03-07 Mocha sipping TA Valerie Ishida inst.eecs.berkeley.edu/~cs61c-td 161 Exabytes

Risk Risk Title Severity (1-10) Probability (0-100%) I FPGA Area II Timing III Input Distortion IV Synchronization 9 60

Risk Risk Title Severity (1-10) Probability (0-100%) I FPGA Area II Timing III Input Distortion IV Synchronization 9 60 Project Planning Introduction In this section, the plans required for completing the project from start to finish are described. The risk analysis section of this project plan will describe the potential

Introduction to Computer Engineering. CS/ECE 252, Spring 2017 Rahul Nayar Computer Sciences Department University of Wisconsin Madison

Introduction to Computer Engineering. CS/ECE 252, Spring 2017 Rahul Nayar Computer Sciences Department University of Wisconsin Madison Introduction to Computer Engineering CS/ECE 252, Spring 2017 Rahul Nayar Computer Sciences Department University of Wisconsin Madison Revision Decoder A decoder is a circuit that changes a code into a

ECE 250 / CPS 250 Computer Architecture. Basics of Logic Design ALU and Storage Elements

ECE 250 / CPS 250 Computer Architecture. Basics of Logic Design ALU and Storage Elements ECE 250 / CPS 250 Computer Architecture Basics of Logic Design ALU and Storage Elements Benjamin Lee Slides based on those from Andrew Hilton (Duke), Alvy Lebeck (Duke) Benjamin Lee (Duke), and Amir Roth (Penn)
