Chapter 4 (Part I) The Processor. Baback Izadi Division of Engineering Programs


EGC442 Introduction to Computer Architecture
Chapter 4 (Part I): The Processor
Baback Izadi, Division of Engineering Programs, bai@engr.newpaltz.edu

Introduction
  CPU performance factors
    Instruction count: determined by ISA and compiler
    CPI and cycle time: determined by CPU hardware
  We will examine two MIPS implementations
    A simplified version
    A more realistic pipelined version
  Simple subset, shows most aspects
    Memory reference: lw, sw
    Arithmetic/logical: add, sub, and, or, slt
    Control transfer: beq, j

Abstract / Simplified View
  [Figure: PC, instruction memory, register file, ALU and data memory connected by address, register-number and data lines]
  Two types of functional units:
    Elements that operate on data values (combinational)
    Elements that contain state (sequential)

Instruction Execution
  PC -> instruction memory, fetch instruction
  Register numbers -> register file, read registers
  Depending on instruction class
    Use ALU to calculate
      Arithmetic result
      Memory address for load/store
      Branch target address
    Access data memory for load/store
    PC = target address or PC + 4

More Detailed CPU Overview
  [Figure: CPU overview datapath]

Multiplexers
  Can't just join wires together
  Use multiplexers

Control
  [Figure: CPU overview with multiplexers and control signals added]

Logic Design Basics
  Information encoded in binary
    Low voltage = 0, High voltage = 1
    One wire per bit
    Multi-bit data encoded on multi-wire buses
  Combinational element
    Operate on data
    Output is a function of input
  State (sequential) elements
    Store information

Combinational Elements
  AND-gate: Y = A & B
  Adder: Y = A + B
  Multiplexer: Y = S ? I1 : I0
  Arithmetic/Logic Unit: Y = F(A, B)

Sequential Elements
  Register: stores data in a circuit
    Uses a clock signal to determine when to update the stored value
    Edge-triggered: update when Clk changes from 0 to 1
  [Figure: D register symbol and timing diagram with Clk, D and Q]
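The elements listed above can be mimicked in software. The following is a minimal sketch, not part of the course material: the names and_gate, adder, mux2 and Register are made up for illustration, and a 32-bit data width is assumed to match MIPS.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit data width


def and_gate(a: int, b: int) -> int:
    """Y = A & B (bitwise over the whole word)."""
    return a & b


def adder(a: int, b: int) -> int:
    """Y = A + B, wrapping at 32 bits like a hardware adder."""
    return (a + b) & MASK32


def mux2(s: int, i0: int, i1: int) -> int:
    """Y = S ? I1 : I0."""
    return i1 if s else i0


class Register:
    """Edge-triggered register: Q takes D only when Clk goes from 0 to 1."""

    def __init__(self) -> None:
        self.q = 0
        self._prev_clk = 0

    def step(self, clk: int, d: int) -> int:
        if self._prev_clk == 0 and clk == 1:  # rising edge
            self.q = d & MASK32
        self._prev_clk = clk
        return self.q
```

Calling step repeatedly with the sampled clock level models the timing diagram: D may change freely while Clk is steady, and Q only follows on a rising edge.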

Sequential Elements
  Register with write control
    Only updates on clock edge when write control input is 1
    Used when stored value is required later
  [Figure: register with Write input and timing diagram with Clk, Write, D and Q]

Clocking Methodology
  Combinational logic transforms data during clock cycles
    Between clock edges
    Input from state elements, output to state element
    Longest delay determines clock period
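As a rough illustration of the two ideas above, the sketch below models a register with a write-control input and picks the clock period as the longest combinational delay. The class and function names are hypothetical, and the delay values in the usage note are made-up numbers, not figures from the slides.

```python
class WriteEnabledRegister:
    """Register that updates on a clock edge only when write is 1."""

    def __init__(self) -> None:
        self.q = 0

    def clock_edge(self, d: int, write: int) -> int:
        # Each call represents one rising clock edge.
        if write:
            self.q = d
        return self.q


def min_clock_period(path_delays):
    """The clock must be at least as long as the slowest combinational path."""
    return max(path_delays)
```

For example, with three hypothetical paths of 300, 450 and 120 time units between state elements, min_clock_period([300, 450, 120]) selects the 450-unit path as the limit.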

Building a Datapath
  Datapath
    Elements that process data and addresses in the CPU
    Registers, ALUs, mux's, memories, ...
  We will build a MIPS datapath incrementally
    Refining the overview design

Abstraction
  Make sure you understand the abstractions!
  Sometimes it is easy to think you do, when you don't
  [Figure: a 32-bit two-input multiplexer (inputs A and B, output C, one Select line) drawn as 32 one-bit multiplexers A31/B31/C31 ... A0/B0/C0 sharing the same Select]

Register File
  Read operation using D flip-flops and MUX's
  [Figure: Register 0 ... Register n-1 feeding two multiplexers; Read register number 1 and Read register number 2 select Read data 1 and Read data 2]
  What is the function of the above?

Register File: Read and Write
  [Figure: register file with inputs Read register number 1, Read register number 2, Write register, Write data, Write, and outputs Read data 1, Read data 2; a decoder on the write register number, gated by Write, drives the clock input C of each register while Write data drives every D input]
  How many registers can we read and write at the same time?
  Does this support MIPS instruction requirements?
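One way to explore the two questions above is a small behavioural model. This is a sketch under assumptions: the class name RegisterFile and its method names are invented here, 32 registers of 32 bits are assumed as in MIPS, and the special behaviour of MIPS register $zero is not modelled.

```python
class RegisterFile:
    """Two read ports (combinational) and one write port (clocked)."""

    def __init__(self, count: int = 32) -> None:
        self.regs = [0] * count

    def read(self, read_reg1: int, read_reg2: int):
        # The two multiplexers: each read port selects one register.
        return self.regs[read_reg1], self.regs[read_reg2]

    def clock_edge(self, write_reg: int, write_data: int, write: int) -> None:
        # The decoder plus the Write signal enable exactly one register.
        if write:
            self.regs[write_reg] = write_data & 0xFFFFFFFF
```

In this model two registers can be read and one written per clock, which is what an R-format instruction such as add needs: two source operands and one destination.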

Building the Datapath
  Include the functional units we need for each instruction
    Instruction memory, program counter, adder
    Registers, ALU
    Data memory unit, sign-extension unit (16 bits to 32 bits)
  ALU control: 000 AND, 001 OR, 010 add, 110 subtract, 111 set-on-less-than
  Use multiplexors to stitch them together

Instruction Fetch
  PC is a 32-bit register
  Increment by 4 for next instruction

R-Format Instructions
  Read two register operands
  Perform arithmetic/logical operation
  Write register result
  [Figure: instruction fields select Read register 1, Read register 2 and Write register; Read data 1 and Read data 2 feed the ALU; the ALU result returns to Write data, with RegWrite and the 3-bit ALU control as control inputs]

Load/Store Instructions
  lw $t1, offset_value($t2) or sw $t1, offset_value($t2)
  Read register operands
  Calculate address using 16-bit offset
    Use ALU, but sign-extend offset
  Load: read memory and update register
  Store: write register value to memory

Datapath for lw and sw Instructions
  Compute memory address
  Sign-extend 16 bit to 32 bit
  [Figure: register file, sign-extend unit, ALU and data memory connected for lw and sw]
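The address calculation for lw and sw can be sketched in a few lines. The helper names sign_extend16 and effective_address are assumptions for this example; the arithmetic follows the steps listed above (sign-extend the 16-bit offset, then add it to the base register in the ALU).

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm: int) -> int:
    """Interpret the low 16 bits of imm as a signed two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def effective_address(base: int, offset16: int) -> int:
    """ALU add of the base register and the sign-extended offset (32-bit wrap)."""
    return (base + sign_extend16(offset16)) & MASK32
```

A negative offset such as 0xFFFC (that is, -4) moves the address below the base, which is why plain zero-extension would give the wrong result.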

Branch Instructions
  beq $t1, $t2, offset
  Read register operands
  Compare operands
    Use ALU to compare registers
    Use ALU to affect Z (Zero) flag
  If not equal, next PC = PC + 4
    Already calculated by instruction fetch
  If equal, calculate target address
    Sign-extend displacement
    Shift left 2 places (word displacement)
    Add to PC + 4

Branch Instructions
  [Figure: branch datapath with PC + 4 from the instruction fetch, sign-extend, shift left 2, an adder producing the branch target, and the ALU Zero output going to the branch control logic]
  Shift left 2: just re-routes wires
  Sign extend: sign-bit wire replicated
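The next-PC selection for beq can be written out as a short sketch. The function name next_pc_beq and its parameters are hypothetical; the steps mirror the slide: compare by subtracting in the ALU and checking for zero, then either keep PC + 4 or add the sign-extended, shifted displacement to it.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm: int) -> int:
    """Interpret the low 16 bits of imm as a signed two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def next_pc_beq(pc: int, rs_val: int, rt_val: int, offset16: int) -> int:
    pc_plus_4 = (pc + 4) & MASK32                 # from instruction fetch
    zero = ((rs_val - rt_val) & MASK32) == 0      # ALU subtract, Zero flag
    # Word displacement: shift left 2, then add to PC + 4.
    target = (pc_plus_4 + (sign_extend16(offset16) << 2)) & MASK32
    return target if zero else pc_plus_4
```

Because the displacement is relative to PC + 4, an offset of -1 (0xFFFF) branches back to the beq instruction itself.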

Composing the Elements
  First-cut datapath does an instruction in one clock cycle
    Each datapath element can only do one function at a time
    Hence, we need separate instruction and data memories
  Use multiplexers where alternate data sources are used for different instructions

R-Type/Load/Store Datapath
  [Figure: combined datapath for R-type, load and store instructions]

Full Datapath
  [Figure: full single-cycle datapath]

Three Instruction Classes
  R-type instruction
    Fields: op | rs | rt | rd | shamt | funct
    rd: destination
  Load and store instructions
    Fields: op | rs | rt | 16-bit offset
    rt: destination for a load (for a store, rt holds the value to be written to memory)
  Branch instruction
    Fields: op | rs | rt | 16-bit offset
    rs and rt: the two registers compared (a branch writes no register)

Completed Datapath
  [Figure: completed datapath. Instruction [31-0] is split into Instruction [25-21], [20-16], [15-11] and [15-0]; multiplexers are controlled by RegDst, ALUSrc, MemtoReg and PCSrc; other control inputs are RegWrite, MemRead, MemWrite and the ALU control]

The Main Control Unit
  Control signals derived from instruction

  Class        Fields (bit positions)
  R-type       0 (31:26) | rs (25:21) | rt (20:16) | rd (15:11) | shamt (10:6) | funct (5:0)
  Load/Store   35 or 43 (31:26) | rs (25:21) | rt (20:16) | address (15:0)
  Branch       4 (31:26) | rs (25:21) | rt (20:16) | address (15:0)

  Bits 31:26: opcode
  rs: always read
  rt: read, except for load
  rd (R-type) or rt (load): write for R-type and load
  address: sign-extend and add
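The field boundaries above translate directly into shifts and masks. The sketch below is illustrative: the function name decode and the dictionary keys are assumptions, and it uses the standard MIPS bit positions (opcode 31:26, rs 25:21, rt 20:16, rd 15:11, shamt 10:6, funct 5:0, immediate 15:0).

```python
def decode(instr: int) -> dict:
    """Split a 32-bit MIPS instruction word into its fields."""
    return {
        "op": (instr >> 26) & 0x3F,     # bits 31:26
        "rs": (instr >> 21) & 0x1F,     # bits 25:21
        "rt": (instr >> 16) & 0x1F,     # bits 20:16
        "rd": (instr >> 11) & 0x1F,     # bits 15:11 (R-type only)
        "shamt": (instr >> 6) & 0x1F,   # bits 10:6  (R-type only)
        "funct": instr & 0x3F,          # bits 5:0   (R-type only)
        "imm16": instr & 0xFFFF,        # bits 15:0  (load/store/branch)
    }
```

The hardware extracts all fields in parallel regardless of the instruction class; the control unit then decides which of them are actually used.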

Datapath With Control
  [Figure: datapath with the main control unit added]

R-Type Instruction
  [Figure: datapath with the active paths for an R-type instruction highlighted]

Load Instruction
  [Figure: datapath with the active paths for a load instruction highlighted]

Branch-on-Equal Instruction
  [Figure: datapath with the active paths for a branch-on-equal instruction highlighted]

Control
  Using the op-code from the instruction, the control issues signals for:
    Selecting the operations to perform (ALU, read/write, etc.)
    Controlling the flow of data (multiplexer inputs)
  ALU's operation is based on instruction type and function code
  Example: what should the ALU do with the instruction add $8, $17, $18?

    op      rs     rt     rd     shamt  funct
    000000  10001  10010  01000  00000  100000

Control
  Example: what should the ALU do with the instruction lw $1, 100($2)?

    op  rs  rt  16-bit offset
    35  2   1   100

  Why is the ALU control code for subtract 110 and not 011?

ALU Control
  Must describe hardware to compute the 3-bit ALU control input
    Given instruction type (ALUOp, computed from instruction type):
      00 = lw, sw
      01 = beq
      10 = arithmetic
    Function code for arithmetic
  ALU control: 000 AND, 001 OR, 010 add, 110 subtract, 111 set-on-less-than

ALU Control
  Inputs: ALUOp and funct field. Output: Operation.

  ALUOp1 ALUOp0 | F5 F4 F3 F2 F1 F0 | Operation
  0      0      | X  X  X  X  X  X  | 010 (add)
  X      1      | X  X  X  X  X  X  | 110 (sub)
  1      X      | X  X  0  0  0  0  | 010 (add)
  1      X      | X  X  0  0  1  0  | 110 (sub)
  1      X      | X  X  0  1  0  0  | 000 (and)
  1      X      | X  X  0  1  0  1  | 001 (or)
  1      X      | X  X  1  0  1  0  | 111 (slt)

  Multi-level decoding can reduce the size of the control unit and increase its speed.
  [Figure: ALU control block with inputs ALUOp1, ALUOp0 and F3-F0 of funct field F(5-0), and outputs Operation2, Operation1, Operation0]
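The ALU control decoding can be sketched as a small function. This assumes the usual textbook 3-bit encoding (000 AND, 001 OR, 010 add, 110 subtract, 111 set-on-less-than) and the standard MIPS funct codes; the names alu_control and FUNCT_TO_OP are invented for this example.

```python
# Low four funct bits (F3..F0) -> 3-bit ALU operation, for ALUOp = 1X.
FUNCT_TO_OP = {
    0b0000: 0b010,  # add
    0b0010: 0b110,  # sub
    0b0100: 0b000,  # and
    0b0101: 0b001,  # or
    0b1010: 0b111,  # slt
}


def alu_control(alu_op: int, funct: int) -> int:
    """Second-level decoder: 2-bit ALUOp plus funct field -> ALU operation."""
    if alu_op == 0b00:        # lw, sw: add for the address calculation
        return 0b010
    if alu_op & 0b01:         # beq: subtract to compare (ALUOp = X1)
        return 0b110
    # Arithmetic (ALUOp = 1X): F5 and F4 are don't-cares, so mask them off.
    return FUNCT_TO_OP[funct & 0b1111]
```

Keeping this decoder separate from the main control is the multi-level decoding idea: the main control only has to produce two ALUOp bits, and the funct field is examined only here.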

A Simple Control
  [Figure: datapath with the ALU control block, driven by Instruction [5-0] and ALUOp]

Control
  [Figure: datapath with the main control unit, driven by Instruction [31-26] and producing RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc and RegWrite]

  Instruction  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
  R-format     1       0       0         1         0        0         0       1       0
  lw           0       1       1         1         1        0         0       0       0
  sw           X       1       X         0         0        1         0       0       0
  beq          X       0       X         0         0        0         1       0       1
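The main control is just a lookup from opcode to a row of signal values. The sketch below assumes the standard textbook settings for these four instruction classes and the MIPS opcodes 0 (R-format), 35 (lw), 43 (sw) and 4 (beq); the names SIGNALS, CONTROL_TABLE and main_control are hypothetical, and None stands for a don't-care (X).

```python
X = None  # don't care

SIGNALS = ("RegDst", "ALUSrc", "MemtoReg", "RegWrite",
           "MemRead", "MemWrite", "Branch", "ALUOp1", "ALUOp0")

CONTROL_TABLE = {
    0:  (1, 0, 0, 1, 0, 0, 0, 1, 0),  # R-format
    35: (0, 1, 1, 1, 1, 0, 0, 0, 0),  # lw
    43: (X, 1, X, 0, 0, 1, 0, 0, 0),  # sw
    4:  (X, 0, X, 0, 0, 0, 1, 0, 1),  # beq
}


def main_control(opcode: int) -> dict:
    """Return the control signal settings for a supported opcode."""
    return dict(zip(SIGNALS, CONTROL_TABLE[opcode]))
```

The don't-cares for sw and beq follow from RegWrite being 0: when no register is written, it does not matter which destination or which write-data source the multiplexers select.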

Single Cycle Control
  Simple combinational logic
  [Figure: ALU control block with inputs ALUOp1, ALUOp0 and F3-F0, and outputs Operation2-Operation0]
  [Figure: main control logic with inputs Op5-Op0, product terms for R-format, lw, sw and beq, and outputs RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1, ALUOp0]

Our Simple Control Structure
  All of the logic is combinational
  We wait for everything to settle down, and the right thing to be done
    ALU might not produce the right answer right away
    We use write signals along with the clock to determine when to write
  Cycle time determined by length of the longest path
  [Figure: content of PC -> combinational logic -> content of PC, both state elements driven by the same clock]

Single Cycle Implementation
  Calculate cycle time assuming negligible delays except:
    memory (200 ps), ALU and adders (200 ps), register file access (100 ps)
  [Figure: single-cycle datapath with control]

  Instruction  Instr. memory  Register read  ALU op.  Data memory  Reg. write  Total
  R-format     200 ps         100 ps         200 ps                100 ps      600 ps
  lw           200 ps         100 ps         200 ps   200 ps       100 ps      800 ps
  sw           200 ps         100 ps         200 ps   200 ps                   700 ps
  beq          200 ps         100 ps         200 ps                            500 ps
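The cycle-time calculation can be reproduced with a few lines of arithmetic. This sketch takes the delays as 200 ps for memory, ALU and adders and 100 ps for register file access; the dictionary and function names are assumptions for illustration.

```python
DELAY_PS = {
    "instr_mem": 200,
    "reg_read": 100,
    "alu": 200,
    "data_mem": 200,
    "reg_write": 100,
}

# Functional units each instruction class passes through, in order.
STAGES = {
    "R-format": ("instr_mem", "reg_read", "alu", "reg_write"),
    "lw": ("instr_mem", "reg_read", "alu", "data_mem", "reg_write"),
    "sw": ("instr_mem", "reg_read", "alu", "data_mem"),
    "beq": ("instr_mem", "reg_read", "alu"),
}


def instruction_time(name: str) -> int:
    """Total delay in ps along the path used by one instruction class."""
    return sum(DELAY_PS[unit] for unit in STAGES[name])


def clock_period() -> int:
    """A single-cycle design must fit the slowest instruction."""
    return max(instruction_time(name) for name in STAGES)
```

Under these assumptions the load instruction sets the clock period, so every other instruction class also takes the full load-length cycle even though its own path is shorter.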

Implementing Jumps
  Jump format: opcode 2 (bits 31:26) | address (bits 25:0)
  Jump uses word address
  Update PC with concatenation of
    Top 4 bits of old PC
    26-bit jump address
    00
  Need an extra control signal decoded from opcode

Datapath With Jumps Added
  [Figure: datapath with the jump address path and an additional multiplexer]
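The concatenation described above can be sketched with masks and shifts. The function name jump_target is hypothetical, and the sketch assumes, as in the standard MIPS datapath, that the "old PC" bits are taken from PC + 4.

```python
def jump_target(pc: int, instr: int) -> int:
    """Build the jump target from PC + 4 and a 32-bit j instruction word."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    top4 = pc_plus_4 & 0xF0000000          # top 4 bits of the old PC
    addr26 = instr & 0x03FFFFFF            # 26-bit jump address (bits 25:0)
    return top4 | (addr26 << 2)            # shifting left 2 appends 00
```

Because only 26 address bits come from the instruction, a jump stays within the 256 MB region selected by the top 4 bits of the PC.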

Performance Issues
  Longest delay determines clock period
    Critical path: load instruction
    Instruction memory -> register file -> ALU -> data memory -> register file
  Not feasible to vary period for different instructions
  Violates design principle
    Making the common case fast
  We will improve performance by pipelining