DYNAMIC INSTRUCTION SCHEDULING WITH TOMASULO


Slides by: Pedro Tomás
Additional reading: Computer Architecture: A Quantitative Approach, 5th edition, Chapter 3, John L. Hennessy and David A. Patterson, Morgan Kaufmann, 2011
Course: ADVANCED COMPUTER ARCHITECTURES / ARQUITECTURAS AVANÇADAS DE COMPUTADORES (AAC)

Outline
Dynamic instruction scheduling:
- Revision of Scoreboard
- Tomasulo algorithm
- Example execution using Tomasulo's algorithm

Dynamic scheduling: Scoreboard revision
Divide the ID/OF stage in two parts:
- ISSUE: instruction decoding and verification of structural and WAW hazards. Once all structural and WAW conflicts are solved, issue the instruction.
- READ OPERANDS (Dispatch): wait until all data hazards are solved, then read the operands from the register file and dispatch the instruction to execution.
Figure: IF stage -> ISSUE stage -> DISP. (controlled by the scoreboard, one "Ready" signal per waiting instruction) -> EX/MEM stage -> WB stage. IF and ISSUE are in order; the remaining stages are out of order.

Scoreboard revision: issue stage
Example instruction sequence:
  L.D   F6,34(R2)
  L.D   F2,45(R3)
  MUL.D F0,F2,F4
  SUB.D F8,F6,F2
  DIV.D F10,F0,F6
  ADD.D F6,F8,F2
Instruction status table: for each instruction, the cycle in which it completed Issue, Disp., EX and WB.
Issue stage: issue the next instruction if no WAW or structural hazard is found:
- WAW hazard if the destination register is already going to be written by another instruction
- Structural hazard if the FU is already busy
FU status table (rows MULT1, MULT2, ADD, DIV): Busy, Op, Fi (DR), Fj (SA), Fk (SB), Qj and Qk (generated by FU?), Rj and Rk (data ready?). Fill the correct row if no hazard is found.
Register results status table (F0, F2, F4, ..., F30): assign the FU that will write to the register.
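The issue test above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the names `can_issue`, `issue`, `fu_busy` and `reg_result` are hypothetical stand-ins for the FU status and register results status tables, and the second and third instructions in the example are invented to trigger each hazard.

```python
def can_issue(fu, dest, fu_busy, reg_result):
    """Return True if an instruction may issue under the scoreboard rules.

    fu         -- name of the functional unit the instruction needs
    dest       -- destination register name
    fu_busy    -- dict: FU name -> True if the unit is busy
    reg_result -- dict: register -> FU that will write it (absent if none)
    """
    structural = fu_busy.get(fu, False)   # the FU is already busy
    waw = dest in reg_result              # a pending write targets dest
    return not (structural or waw)


def issue(fu, dest, fu_busy, reg_result):
    """Fill both tables when no hazard is found; return whether it issued."""
    if not can_issue(fu, dest, fu_busy, reg_result):
        return False
    fu_busy[fu] = True
    reg_result[dest] = fu
    return True


fu_busy, reg_result = {}, {}
first = issue("MULT1", "F0", fu_busy, reg_result)   # MUL.D F0,F2,F4
second = issue("DIV", "F0", fu_busy, reg_result)    # hypothetical: WAW on F0
third = issue("MULT1", "F8", fu_busy, reg_result)   # hypothetical: MULT1 busy
```

Only the first call issues; the other two leave the tables untouched, which is the stall behaviour the slide describes.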

Scoreboard revision: dispatch stage
Dispatch stage: dispatch to execution all instructions that have valid operands.
FU status example (Rj, Rk = data ready?):
- MULT1: Rj = YES, Rk = YES: dispatch and set Rj, Rk to No
- ADD: Rj = NO, Rk = YES: don't dispatch
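A minimal Python sketch of the dispatch test, assuming each FU status row is a dictionary with boolean `Rj` and `Rk` flags. The function name `try_dispatch` and the dictionary layout are illustrative choices, not part of the slides.

```python
def try_dispatch(row):
    """Dispatch the instruction in an FU status row if both operands are ready.

    row is a dict with boolean flags Rj and Rk ("data ready?").
    Returns True if the instruction was dispatched.
    """
    if row["Rj"] and row["Rk"]:
        # Operands are read from the register file; the flags are cleared
        # so that later WAR checks know the registers have been read.
        row["Rj"] = row["Rk"] = False
        return True
    return False


mult1 = {"Rj": True, "Rk": True}    # MULT1 row from the slide
add = {"Rj": False, "Rk": True}     # ADD row from the slide
dispatched = [try_dispatch(mult1), try_dispatch(add)]
```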

Scoreboard revision: execute stage
Execute stage: wait for the instruction to complete execution and inform the scoreboard when it finishes.

Scoreboard revision: write back stage
Write back stage: write the result to the destination register if no WAR hazard is found.
- WAR hazard if an instruction still requires the value in the register; this happens if a preceding instruction is stuck at the dispatch stage waiting for some other value.
On write: clear the slot in the FU status table and set the register value as valid in the register results status table.
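The WAR test can be sketched as follows. The sketch assumes the usual scoreboard convention that `Rj`/`Rk` are set while an operand is ready but not yet read, and cleared once the operands have been read; the function name `can_write_back` and the table layout are hypothetical. The example state mirrors a DIV.D F10,F0,F6 still waiting for F0 while an ADD.D F6,F8,F2 wants to write F6.

```python
def can_write_back(writer, fu_status):
    """Return True if the FU `writer` may write its result (no WAR hazard)."""
    dest = fu_status[writer]["Fi"]
    for name, row in fu_status.items():
        if name == writer or not row["Busy"]:
            continue
        # A set Rj/Rk flag means: operand is ready but has not been read yet.
        if (row["Fj"] == dest and row["Rj"]) or (row["Fk"] == dest and row["Rk"]):
            return False
    return True


fu_status = {
    "DIV": {"Busy": True, "Fi": "F10", "Fj": "F0", "Fk": "F6",
            "Rj": False, "Rk": True},    # F0 pending, F6 ready but unread
    "ADD": {"Busy": True, "Fi": "F6", "Fj": "F8", "Fk": "F2",
            "Rj": False, "Rk": False},   # operands already read
}
stalled = not can_write_back("ADD", fu_status)

# Once DIV.D has read its operands, its flags are cleared and ADD.D may write.
fu_status["DIV"]["Rj"] = fu_status["DIV"]["Rk"] = False
released = can_write_back("ADD", fu_status)
```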

Scoreboard update example
Instruction status at cycle 18:
  L.D   F6,34(R2)    completed
  L.D   F2,45(R3)    completed
  MUL.D F0,F2,F4     executing
  SUB.D F8,F6,F2     completed
  DIV.D F10,F0,F6    at dispatch
  ADD.D F6,F8,F2     ending EX
Cycle 18: the ADD.D has finished executing, but will stall on cycle 19 because of DIV.D:
- DIV.D precedes ADD.D
- DIV.D was stalled at the dispatch stage because of a RAW hazard on the value of F0
- DIV.D reads both operands at the same time, so it has not yet read F6, the register that ADD.D wants to overwrite (WAR hazard)

Tomasulo algorithm
Proposed by Robert Tomasulo in 1966:
- Initially proposed to overcome the long latencies of both memory accesses and floating-point operations
- First implemented on the IBM 360/91
- The algorithm proved far more powerful than anticipated: variants of it are used in almost all modern superscalar processors

Tomasulo's algorithm: general idea
Instead of centralizing the control in a scoreboard, distribute it amongst the different components:
- Instructions no longer wait at a dispatch stage; instead, they are issued directly to reservation stations associated with the functional units
- When an instruction is issued, the available operand values are copied directly to the reservation station (this works as a form of register renaming)
- If an operand is not available, the reservation station stores which instruction will generate it (identified by the reservation station holding that instruction)
Figure: IF -> ISSUE -> reservation stations for FU 1 (e.g., ALU) and reservation stations for FU 2 (e.g., LD/ST), with results returned on the Common Data Bus (CDB).
- When busy, the reservation stations hold instructions
- An instruction can be identified by the reservation station where it is held

Tomasulo's algorithm: general idea (continued)
- Instruction issue stalls if all reservation stations for the given operation are busy
- Functional units (FUs) can be pipelined and may have different numbers of reservation stations
- All units write to a CDB, which forwards the results to the reservation stations and to the register file (RF)
Figure: IF -> ISSUE -> Register File and reservation stations: S1-S4 and L1-L4 (address calculation, MEMORY), I1-I4 for FU 2 (ALU), A1-A4 for FU 3 (FP ADD), M1-M3 for FU 4 (FP MULT), D1-D2 for FU 5 (FP DIV), all connected to the Common Data Bus (CDB).

Tomasulo's algorithm: reservation stations
Information on each reservation station (identified by its label, Qn):
  Busy      station availability
  Op        operation to execute
  Vj, Vk    value of operands j, k (valid if the operands are ready)
  Qj, Qk    readiness of operands j, k (label of the reservation station holding the instruction that will generate the result)
Load/store operations have an additional field for indexed loads/stores, e.g., M[R[AA] + Imm] <- R[BA]:
  A         used to store the immediate and, later, the effective load/store address
Additional information stored in the RF: each integer register (R0, R1, ..., Rn) and each FP register (F0, F1, ..., Fn) holds its data plus a readiness label (Q0, Q1, ..., Qn). The label marks the register as ready (value of zero) or not ready (indicating the reservation station holding the instruction that generates the value).
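The fields listed above map naturally onto two small record types. The following Python sketch is one possible encoding, not the slides' own: the class and attribute names are assumptions, and `None` is used as the "ready" marker where the slides use a label of zero.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReservationStation:
    name: str                     # label used as the tag on the CDB
    busy: bool = False            # station availability
    op: Optional[str] = None      # operation to execute
    vj: Optional[float] = None    # value of operand j (valid when qj is None)
    vk: Optional[float] = None    # value of operand k (valid when qk is None)
    qj: Optional[str] = None      # station producing operand j (None = ready)
    qk: Optional[str] = None      # station producing operand k (None = ready)
    a: Optional[int] = None       # immediate, later effective address (LD/ST only)

    def ready(self) -> bool:
        """An instruction can start executing once both operands are present."""
        return self.busy and self.qj is None and self.qk is None


@dataclass
class Register:
    value: float = 0.0
    q: Optional[str] = None       # None = ready, else the producing station


# MUL.D F0,F2,F4 waiting for F2 from LD/ST2 (operand values are made up).
rs = ReservationStation("FP MULT1", busy=True, op="MUL.D", vk=2.5, qj="LD/ST2")
waiting = not rs.ready()
rs.vj, rs.qj = 1.0, None          # the value of F2 arrives
runnable = rs.ready()
```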

Tomasulo's algorithm: issue stage
1. Decode the instruction: identify both the operation and the operands
2. Verify that the required functional unit has at least one reservation station available (i.e., not busy)
   - If no reservation station is available (structural hazard), stall
   - If a reservation station is available, issue the instruction, indicating:
     a) the operation to execute;
     b) the value of all operands that are available, i.e., the value stored in the register file (RF);
     c) for each operand that is not available, the reservation station holding the instruction that will generate the corresponding value
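The issue steps above can be sketched in Python as follows. This is a simplified model under stated assumptions: reservation stations are plain dictionaries (an empty dictionary means "free"), `reg_q` lists only the registers that are not ready, and the function name `issue` and all operand values are hypothetical.

```python
def issue(op, dest, srcs, stations, reg_value, reg_q):
    """Try to issue an instruction; return the station name, or None on a stall.

    stations  -- dict: station name -> dict (an empty dict means free)
    reg_value -- dict: register -> value in the register file
    reg_q     -- dict: register -> station that will produce it (absent = ready)
    """
    free = [name for name, rs in stations.items() if not rs]
    if not free:
        return None                       # structural hazard: stall
    name = free[0]
    rs = {"Op": op}
    for v, q, src in (("Vj", "Qj", srcs[0]), ("Vk", "Qk", srcs[1])):
        if src in reg_q:                  # operand not yet available
            rs[v], rs[q] = None, reg_q[src]
        else:                             # copy the value (register renaming)
            rs[v], rs[q] = reg_value[src], None
    stations[name] = rs
    reg_q[dest] = name                    # dest will be produced by this station
    return name


mult_stations = {"FP MULT1": {}, "FP MULT2": {}}
reg_value = {"F4": 4.0}                         # made-up value for F4
reg_q = {"F2": "LD/ST2", "F6": "LD/ST1"}        # both loads still in flight

first = issue("MUL.D", "F0", ("F2", "F4"), mult_stations, reg_value, reg_q)
second = issue("DIV.D", "F10", ("F0", "F6"), mult_stations, reg_value, reg_q)
third = issue("MUL.D", "F12", ("F4", "F4"), mult_stations, reg_value, reg_q)
```

The first call records `LD/ST2` as the producer of F2 and copies the value of F4; the third call returns `None` because both stations are busy, which models the stall.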

Tomasulo's algorithm: execute stage
1. If a reservation station has all operands available and there is a functional unit available, start executing the instruction
2. Monitor (snoop) writes to the common data bus (CDB); if a value written on the CDB is required by an instruction in a reservation station, retrieve it and store it in the corresponding field of the reservation station
Example:
  ...
  DIV.D  F4,F0,F2
  DIV.D  F6,F4,F2
  DADD.D F0,F4,F6
  ...
The functional unit FU5 (floating-point division) writes to the CDB the result of the instruction held in reservation station D2. The reservation stations D1 and A3 hold instructions that require that value; they take the result and store it in the corresponding fields.

Tomasulo's algorithm: writing on the CDB
When writing a value on the CDB:
- Write the value plus the label of the reservation station where the instruction was stored
- Whenever a reservation station (or register) needs a value, it takes it from the CDB
Example (sequence DIV.D F4,F0,F2; DIV.D F6,F4,F2; DADD.D F0,F4,F6): reservation station D2 holds the first division and reservation station D1 holds the second division. The DADD.D instruction, held in reservation station A3, is waiting for the values produced by D2 and D1:
  Reservation station A3
  Station availability    Busy
  Operation to execute    DADD.D
  Vj, Vk                  (invalid data), (invalid data)
  Qj                      D2: wait for the value being produced by the instruction in reservation station D2
  Qk                      D1: wait for the value being produced by the instruction in reservation station D1
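The snooping and write steps can be sketched together as a single broadcast function. The sketch below is an assumption-laden illustration: stations are dictionaries, `reg_q` lists only the registers that are not ready, the function name `broadcast` is hypothetical, and the numeric values are made up. The state reproduces the D2/D1/A3 example from the slides.

```python
def broadcast(tag, value, stations, reg_value, reg_q):
    """Write (tag, value) on the CDB; every waiting consumer captures it."""
    for rs in stations.values():
        if rs.get("Qj") == tag:
            rs["Vj"], rs["Qj"] = value, None
        if rs.get("Qk") == tag:
            rs["Vk"], rs["Qk"] = value, None
    for reg, q in list(reg_q.items()):
        if q == tag:                  # register still waits for this station
            reg_value[reg] = value
            del reg_q[reg]
    stations[tag] = {}                # the producing station becomes free


stations = {
    "D2": {"Op": "DIV.D", "Vj": 1.0, "Vk": 2.0, "Qj": None, "Qk": None},
    "D1": {"Op": "DIV.D", "Vj": None, "Vk": 2.0, "Qj": "D2", "Qk": None},
    "A3": {"Op": "DADD.D", "Vj": None, "Vk": None, "Qj": "D2", "Qk": "D1"},
}
reg_value = {"F2": 2.0}
reg_q = {"F4": "D2", "F6": "D1", "F0": "A3"}

broadcast("D2", 0.5, stations, reg_value, reg_q)   # first division completes
```

After the broadcast, D1 has both operands and can start executing, while A3 still waits for D1.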

Tomasulo's algorithm: load/store unit
- The load/store unit is seen as a functional unit with read/write (load/store) buffers to the memory
- The load/store buffers can be seen as reservation stations
Figure: address calculation, store buffers (S1-S4), MEMORY, load buffers (L1-L4), Common Data Bus (CDB).

Tomasulo's algorithm: solving hazards
- RAW hazards: solved by letting an instruction wait for the corresponding value in a reservation station
- WAR / WAW hazards: solved by renaming the registers (use of reservation stations)
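A small Python sketch of why renaming removes WAR and WAW hazards. Everything here is illustrative: the helper `write_register`, the station labels `A1`/`A2` in the WAW part, and all numeric values are assumptions made for the example.

```python
def write_register(tag, value, reg, reg_value, reg_q):
    """Update `reg` only if it is still waiting for the station `tag`."""
    if reg_q.get(reg) == tag:
        reg_value[reg] = value
        del reg_q[reg]


# WAR: DIV.D F10,F0,F6 copies the value of F6 when it issues ...
reg_value = {"F2": 3.0, "F6": 1.5, "F8": 2.0}
div_station = {"Op": "DIV.D", "Vj": None, "Qj": "FP MULT1",
               "Vk": reg_value["F6"], "Qk": None}
# ... so a later ADD.D F6,F8,F2 may overwrite F6 before DIV.D executes.
reg_value["F6"] = reg_value["F8"] + reg_value["F2"]
old_f6_seen_by_div = div_station["Vk"]

# WAW: two in-flight writers of F6, issued in the order A1 then A2.
reg_q = {"F6": "A2"}                  # only the latest writer is recorded
write_register("A2", 9.0, "F6", reg_value, reg_q)   # A2 completes first
write_register("A1", 7.0, "F6", reg_value, reg_q)   # older result is ignored
final_f6 = reg_value["F6"]
```

The division keeps the old value of F6, and the register file ends up with the result of the last writer in program order regardless of completion order.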

Tomasulo's algorithm: example
Consider the execution of the following instructions:
  L.D   F6,34(R2)
  L.D   F2,45(R3)
  MUL.D F0,F2,F4
  SUB.D F8,F6,F2
  DIV.D F10,F0,F6
  ADD.D F6,F8,F2
on a processor with:
- Non-pipelined functional units:
  - 1x integer ALU, with 1 cycle latency
  - 1x FP multiplier, with 10 cycles latency
  - 1x FP adder/subtractor, with 2 cycles latency
  - 1x FP division, with 40 cycles latency
- A load/store unit with 2 cycles latency (address calculation + memory access)
- Reservation stations:
  - 3 load/store buffers
  - 1 slot for integer operations
  - 2 slots for FP multiplication/division
  - 2 slots for FP addition/subtraction
Similar architecture to the CDC 6600, except that we are now using Tomasulo's algorithm instead of a scoreboard.

Tomasulo execution example
Instruction status (Issue, EX, WB; not required in Tomasulo, used only for illustration): no instruction has been issued yet.
Reservation stations (LD/ST buffers 1-3, FP Mult/Div 1-2, FP Adder 1-2; fields Busy, Op, Vj (OpA), Vk (OpB), Qj, Qk, A (address)): all free.
Register status (Q, for R1-R3 and F0-F10): no register is waiting for a result.

Tomasulo execution example
Instruction status (not required in Tomasulo, used only for illustration):
- L.D F6,34(R2): issued in cycle 1
- All other instructions: not yet issued
Reservation stations (all others free):
  Station          Busy  Op     Vj  Vk  Qj        Qk     A
  LD/ST buffer 1   Yes   L.D    R2  0   Ready     Ready  34
Register status (Q): F6 = LD/ST1

Tomasulo execution example
Instruction status (not required in Tomasulo, used only for illustration):
- L.D F6,34(R2): issued in cycle 1; calculated effective address
- L.D F2,45(R3): issued in cycle 2
- All other instructions: not yet issued
Reservation stations (all others free):
  Station          Busy  Op     Vj  Vk  Qj        Qk     A
  LD/ST buffer 1   Yes   L.D    R2  0   Ready     Ready  34+R2
  LD/ST buffer 2   Yes   L.D    R3  0   Ready     Ready  45
Register status (Q): F2 = LD/ST2, F6 = LD/ST1

Tomasulo execution example
Instruction status (not required in Tomasulo, used only for illustration):
- L.D F6,34(R2): issued in cycle 1; finished loading the value in cycle 3
- L.D F2,45(R3): issued in cycle 2; calculated effective address
- MUL.D F0,F2,F4: issued in cycle 3
- All other instructions: not yet issued
Reservation stations (all others free):
  Station          Busy  Op     Vj  Vk  Qj        Qk     A
  LD/ST buffer 1   Yes   L.D    R2  0   Ready     Ready  34+R2
  LD/ST buffer 2   Yes   L.D    R3  0   Ready     Ready  45+R3
  FP Mult/Div 1    Yes   MUL.D  -   F4  LD/ST2    Ready
The value of F4 is copied, which is equivalent to register renaming.
Register status (Q): F0 = FP MULT1, F2 = LD/ST2, F6 = LD/ST1

Tomasulo execution example
Instruction status (not required in Tomasulo, used only for illustration):
- L.D F6,34(R2): write the result
- L.D F2,45(R3): issued in cycle 2; finished execution in cycle 4
- MUL.D F0,F2,F4: issued in cycle 3
- SUB.D F8,F6,F2: issued in cycle 4
- All other instructions: not yet issued
Reservation stations (all others free):
  Station          Busy  Op     Vj  Vk  Qj        Qk      A
  LD/ST buffer 2   Yes   L.D    R3  0   Ready     Ready   45+R3
  FP Mult/Div 1    Yes   MUL.D  -   F4  LD/ST2    Ready
  FP Adder 1       Yes   SUB.D  F6  -   Ready     LD/ST2
The value of F6 is forwarded from the CDB.
Register status (Q): F0 = FP MULT1, F2 = LD/ST2, F6 = LD/ST1, F8 = FP ADD1

Tomasulo execution example
Instruction status (not required in Tomasulo, used only for illustration):
- L.D F6,34(R2): completed
- L.D F2,45(R3): write the result
- MUL.D F0,F2,F4: issued in cycle 3; 10 cycles left
- SUB.D F8,F6,F2: issued in cycle 4; 2 cycles left
- DIV.D F10,F0,F6: issued in cycle 5
- ADD.D F6,F8,F2: not yet issued
Reservation stations (all others free):
  Station          Busy  Op     Vj  Vk  Qj        Qk     A
  FP Mult/Div 1    Yes   MUL.D  F2  F4  Ready     Ready
  FP Mult/Div 2    Yes   DIV.D  -   F6  FP MULT1  Ready
  FP Adder 1       Yes   SUB.D  F6  F2  Ready     Ready
The value of F2 is forwarded from the CDB; the waiting instructions become ready and start executing.
Register status (Q): F0 = FP MULT1, F2 = LD/ST2, F8 = FP ADD1, F10 = FP MULT2

Tomasulo execution example
Instruction status (not required in Tomasulo, used only for illustration):
- L.D F6,34(R2) and L.D F2,45(R3): completed
- MUL.D F0,F2,F4: issued in cycle 3; 9 cycles left
- SUB.D F8,F6,F2: issued in cycle 4; 1 cycle left
- DIV.D F10,F0,F6: issued in cycle 5
- ADD.D F6,F8,F2: issued in cycle 6
Reservation stations (all others free):
  Station          Busy  Op     Vj  Vk  Qj        Qk     A
  FP Mult/Div 1    Yes   MUL.D  F2  F4  Ready     Ready
  FP Mult/Div 2    Yes   DIV.D  -   F6  FP MULT1  Ready
  FP Adder 1       Yes   SUB.D  F6  F2  Ready     Ready
  FP Adder 2       Yes   ADD.D  -   F2  FP ADD1   Ready
Register status (Q): F0 = FP MULT1, F6 = FP ADD2, F8 = FP ADD1, F10 = FP MULT2

Tomasulo execution example
Instruction status (not required in Tomasulo, used only for illustration):
- L.D F6,34(R2) and L.D F2,45(R3): completed
- MUL.D F0,F2,F4: issued in cycle 3; 8 cycles left
- SUB.D F8,F6,F2: issued in cycle 4; finished execution in cycle 7
- DIV.D F10,F0,F6: issued in cycle 5
- ADD.D F6,F8,F2: issued in cycle 6
Reservation stations (all others free):
  Station          Busy  Op     Vj  Vk  Qj        Qk     A
  FP Mult/Div 1    Yes   MUL.D  F2  F4  Ready     Ready
  FP Mult/Div 2    Yes   DIV.D  -   F6  FP MULT1  Ready
  FP Adder 1       Yes   SUB.D  F6  F2  Ready     Ready
  FP Adder 2       Yes   ADD.D  -   F2  FP ADD1   Ready
Register status (Q): F0 = FP MULT1, F6 = FP ADD2, F8 = FP ADD1, F10 = FP MULT2

Tomasulo execution example
Instruction status (not required in Tomasulo, used only for illustration):
- L.D F6,34(R2) and L.D F2,45(R3): completed
- MUL.D F0,F2,F4: issued in cycle 3; 7 cycles left
- SUB.D F8,F6,F2: write the result
- DIV.D F10,F0,F6: issued in cycle 5
- ADD.D F6,F8,F2: issued in cycle 6; 2 cycles left
Reservation stations (all others free):
  Station          Busy  Op     Vj  Vk  Qj        Qk     A
  FP Mult/Div 1    Yes   MUL.D  F2  F4  Ready     Ready
  FP Mult/Div 2    Yes   DIV.D  -   F6  FP MULT1  Ready
  FP Adder 2       Yes   ADD.D  F8  F2  Ready     Ready
The value of F8 is forwarded from the CDB; the instruction becomes ready and starts executing.
Register status (Q): F0 = FP MULT1, F6 = FP ADD2, F8 = FP ADD1, F10 = FP MULT2

Tomasulo execution example
Instruction status (not required in Tomasulo, used only for illustration):
- L.D F6,34(R2), L.D F2,45(R3) and SUB.D F8,F6,F2: completed
- MUL.D F0,F2,F4: issued in cycle 3; 5 cycles left
- DIV.D F10,F0,F6: issued in cycle 5
- ADD.D F6,F8,F2: finished execution
Reservation stations (all others free):
  Station          Busy  Op     Vj  Vk  Qj        Qk     A
  FP Mult/Div 1    Yes   MUL.D  F2  F4  Ready     Ready
  FP Mult/Div 2    Yes   DIV.D  -   F6  FP MULT1  Ready
  FP Adder 2       Yes   ADD.D  F8  F2  Ready     Ready
Register status (Q): F0 = FP MULT1, F6 = FP ADD2, F10 = FP MULT2

Tomasulo execution example
Instruction status (not required in Tomasulo, used only for illustration):
- L.D F6,34(R2), L.D F2,45(R3) and SUB.D F8,F6,F2: completed
- MUL.D F0,F2,F4: issued in cycle 3; 4 cycles left
- DIV.D F10,F0,F6: issued in cycle 5
- ADD.D F6,F8,F2: write the result
Reservation stations (all others free):
  Station          Busy  Op     Vj  Vk  Qj        Qk     A
  FP Mult/Div 1    Yes   MUL.D  F2  F4  Ready     Ready
  FP Mult/Div 2    Yes   DIV.D  -   F6  FP MULT1  Ready
Register status (Q): F0 = FP MULT1, F6 = FP ADD2, F10 = FP MULT2

Tomasulo execution example
Instruction status (not required in Tomasulo, used only for illustration):
- L.D F6,34(R2), L.D F2,45(R3), SUB.D F8,F6,F2 and ADD.D F6,F8,F2: completed
- MUL.D F0,F2,F4: finished execution
- DIV.D F10,F0,F6: issued in cycle 5
Reservation stations (all others free):
  Station          Busy  Op     Vj  Vk  Qj        Qk     A
  FP Mult/Div 1    Yes   MUL.D  F2  F4  Ready     Ready
  FP Mult/Div 2    Yes   DIV.D  -   F6  FP MULT1  Ready
Register status (Q): F0 = FP MULT1, F10 = FP MULT2

Tomasulo execution example
Instruction status (not required in Tomasulo, used only for illustration):
- L.D F6,34(R2), L.D F2,45(R3), SUB.D F8,F6,F2 and ADD.D F6,F8,F2: completed
- MUL.D F0,F2,F4: write the result
- DIV.D F10,F0,F6: 40 cycles left
Reservation stations (all others free):
  Station          Busy  Op     Vj  Vk  Qj        Qk     A
  FP Mult/Div 2    Yes   DIV.D  F0  F6  Ready     Ready
The value of F0 is forwarded from the CDB; the instruction becomes ready and starts executing.
Register status (Q): F0 = FP MULT1, F10 = FP MULT2

Tomasulo execution example
Instruction status (not required in Tomasulo, used only for illustration):
- All instructions before and after DIV.D: completed
- DIV.D F10,F0,F6: finished execution
Reservation stations (all others free):
  Station          Busy  Op     Vj  Vk  Qj        Qk     A
  FP Mult/Div 2    Yes   DIV.D  F0  F6  Ready     Ready
Register status (Q): F10 = FP MULT2

Tomasulo execution example
Instruction status (not required in Tomasulo, used only for illustration):
- All instructions before and after DIV.D: completed
- DIV.D F10,F0,F6: write the result
Reservation stations: all free.
Register status (Q): F10 = FP MULT2
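The timeline of the walk-through can be approximated with a short Python model. This is a simplified sketch, not the slides' method: it assumes one instruction issues per cycle, an instruction starts counting its latency in the cycle its last operand is written on the CDB (or in its issue cycle if all operands are ready), the result is written one cycle after execution finishes, and there is no contention for functional units, reservation stations or the CDB. The names `LATENCY`, `PROGRAM` and `schedule` are hypothetical.

```python
LATENCY = {"L.D": 2, "MUL.D": 10, "SUB.D": 2, "DIV.D": 40, "ADD.D": 2}

# (operation, destination, FP source registers) in program order.
PROGRAM = [
    ("L.D", "F6", []),
    ("L.D", "F2", []),
    ("MUL.D", "F0", ["F2", "F4"]),
    ("SUB.D", "F8", ["F6", "F2"]),
    ("DIV.D", "F10", ["F0", "F6"]),
    ("ADD.D", "F6", ["F8", "F2"]),
]


def schedule(program, latency):
    """Return (op, dest, issue, finish, wb) cycles for each instruction."""
    producer_wb = {}   # register -> WB cycle of its latest producer so far
    rows = []
    for issue, (op, dest, srcs) in enumerate(program, start=1):
        # Operands are bound at issue time (renaming): only producers that
        # precede this instruction in program order are considered.
        ready = max([issue] + [producer_wb.get(r, 0) for r in srcs])
        finish = ready + latency[op]
        wb = finish + 1
        rows.append((op, dest, issue, finish, wb))
        producer_wb[dest] = wb
    return rows


timeline = schedule(PROGRAM, LATENCY)
for op, dest, issue, finish, wb in timeline:
    print(f"{op:6s} {dest:4s} issue={issue:2d} finish={finish:2d} wb={wb:2d}")
```

Under these assumptions the model gives the same issue cycles (1 to 6) and the same execution-finish cycles for the two loads (3 and 4) and for SUB.D (7) that the walk-through shows.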

Tomasulo execution example: summary
Instruction status (Tomasulo): Issue is in order; EX and WB are out of order.
Instruction status (Scoreboard): Issue is in order; Disp., EX and WB are out of order.
Speedup of Tomasulo over the scoreboard:
- ISSUE: speedup = 13 / 6 = 2.17
- WB: speedup = 1.09
Note: additional gains are achieved by easing the implementation of other architectural changes.

Tomasulo vs Scoreboard
                         Scoreboard                                        Tomasulo
  Structural hazards     Stalls the pipeline                               Stalls the pipeline
  WAW hazards            Stalls the pipeline                               Solved by applying renaming (use of reservation stations)
  WAR hazards            Delays writing the result                         Solved by applying renaming (use of reservation stations)
  Control structure      Centralized in the scoreboard                     Distributed in reservation stations
  Forwarding             Hard to apply                                     Automatically applied through the CDB
  Simultaneous writings  Delayed writing may lead to structural hazards    Simultaneous access to the CDB may lead to structural hazards
  Instruction window     Smaller                                           Larger

Next lesson
Dynamic techniques to extract parallelism:
- More on Tomasulo
- Dynamic branch prediction


More information

Pipelining. Improve performance by increasing instruction throughput Program execution order. Data access. Instruction. fetch. Data access.

Pipelining. Improve performance by increasing instruction throughput Program execution order. Data access. Instruction. fetch. Data access. Chapter 6 Pipelining Improve performance by increasing instrction throghpt Program eection order Time (in instrctions) lw $, ($) Instrction fetch 2 4 6 8 2 4 6 8 ALU Data access lw $2, 2($) 8 ns Instrction

More information

A New Family of High-Performance Parallel Decimal Multipliers*

A New Family of High-Performance Parallel Decimal Multipliers* A New Family of High-Performance Parallel Decimal Multipliers* Alvaro Vázquez, Elisardo Antelo Dept. of Electronic and Computer Science University of Santiago de Compostela Spain alvaro@dec.usc.es elisardo@dec.usc.es

More information

Low Power VLSI Circuits and Systems Prof. Ajit Pal Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

Low Power VLSI Circuits and Systems Prof. Ajit Pal Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Low Power VLSI Circuits and Systems Prof. Ajit Pal Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No. # 29 Minimizing Switched Capacitance-III. (Refer

More information

An automatic synchronous to asynchronous circuit convertor

An automatic synchronous to asynchronous circuit convertor An automatic synchronous to asynchronous circuit convertor Charles Brej Abstract The implementation methods of asynchronous circuits take time to learn, they take longer to design and verifying is very

More information

CS/ECE 250: Computer Architecture. Basics of Logic Design: ALU, Storage, Tristate. Benjamin Lee

CS/ECE 250: Computer Architecture. Basics of Logic Design: ALU, Storage, Tristate. Benjamin Lee CS/ECE 25: Computer Architecture Basics of Logic esign: ALU, Storage, Tristate Benjamin Lee Slides based on those from Alvin Lebeck, aniel, Andrew Hilton, Amir Roth, Gershon Kedem Homework #3 ue Mar 7,

More information

6.3 Sequential Circuits (plus a few Combinational)

6.3 Sequential Circuits (plus a few Combinational) 6.3 Sequential Circuits (plus a few Combinational) Logic Gates: Fundamental Building Blocks Introduction to Computer Science Robert Sedgewick and Kevin Wayne Copyright 2005 http://www.cs.princeton.edu/introcs

More information

Chapter 5 Sequential Circuits

Chapter 5 Sequential Circuits Logic and Computer Design Fundamentals Chapter 5 Sequential Circuits Part 2 Sequential Circuit Design Charles Kime & Thomas Kaminski 28 Pearson Education, Inc. (Hyperlinks are active in View Show mode)

More information

Slide Set 6. for ENCM 369 Winter 2018 Section 01. Steve Norman, PhD, PEng

Slide Set 6. for ENCM 369 Winter 2018 Section 01. Steve Norman, PhD, PEng Slide Set 6 for ENCM 369 Winter 2018 Section 01 Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary February 2018 ENCM 369 Winter 2018 Section

More information

Contents Slide Set 6. Introduction to Chapter 7 of the textbook. Outline of Slide Set 6. An outline of the first part of Chapter 7

Contents Slide Set 6. Introduction to Chapter 7 of the textbook. Outline of Slide Set 6. An outline of the first part of Chapter 7 CM 69 W4 Section Slide Set 6 slide 2/9 Contents Slide Set 6 for CM 69 Winter 24 Lecture Section Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary

More information

BUSES IN COMPUTER ARCHITECTURE

BUSES IN COMPUTER ARCHITECTURE BUSES IN COMPUTER ARCHITECTURE The processor, main memory, and I/O devices can be interconnected by means of a common bus whose primary function is to provide a communication path for the transfer of data.

More information

Objectives. Combinational logics Sequential logics Finite state machine Arithmetic circuits Datapath

Objectives. Combinational logics Sequential logics Finite state machine Arithmetic circuits Datapath Objectives Combinational logics Sequential logics Finite state machine Arithmetic circuits Datapath In the previous chapters we have studied how to develop a specification from a given application, and

More information

Lecture 0: Organization

Lecture 0: Organization 581365 Tietokoneen rakenne Computer Organization II Spring 2010 Tiina Niklander Matemaattis-luonnontieteellinen tiedekunta Computer Organization II Advanced (master) level course! Prerequisite: Computer

More information

Logic Devices for Interfacing, The 8085 MPU Lecture 4

Logic Devices for Interfacing, The 8085 MPU Lecture 4 Logic Devices for Interfacing, The 8085 MPU Lecture 4 1 Logic Devices for Interfacing Tri-State devices Buffer Bidirectional Buffer Decoder Encoder D Flip Flop :Latch and Clocked 2 Tri-state Logic Outputs

More information

Altera s Max+plus II Tutorial

Altera s Max+plus II Tutorial Altera s Max+plus II Tutorial Written by Kris Schindler To accompany Digital Principles and Design (by Donald D. Givone) 8/30/02 1 About Max+plus II Altera s Max+plus II is a powerful simulation package

More information

Chapter 4. Logic Design

Chapter 4. Logic Design Chapter 4 Logic Design 4.1 Introduction. In previous Chapter we studied gates and combinational circuits, which made by gates (AND, OR, NOT etc.). That can be represented by circuit diagram, truth table

More information

ECSE-323 Digital System Design. Datapath/Controller Lecture #1

ECSE-323 Digital System Design. Datapath/Controller Lecture #1 1 ECSE-323 Digital System Design Datapath/Controller Lecture #1 2 Synchronous Digital Systems are often designed in a modular hierarchical fashion. The system consists of modular subsystems, each of which

More information

CHAPTER 4 RESULTS & DISCUSSION

CHAPTER 4 RESULTS & DISCUSSION CHAPTER 4 RESULTS & DISCUSSION 3.2 Introduction This project aims to prove that Modified Baugh-Wooley Two s Complement Signed Multiplier is one of the high speed multipliers. The schematic of the multiplier

More information

Flip Flop. S-R Flip Flop. Sequential Circuits. Block diagram. Prepared by:- Anwar Bari

Flip Flop. S-R Flip Flop. Sequential Circuits. Block diagram. Prepared by:- Anwar Bari Sequential Circuits The combinational circuit does not use any memory. Hence the previous state of input does not have any effect on the present state of the circuit. But sequential circuit has memory

More information

CSE140L: Components and Design Techniques for Digital Systems Lab. CPU design and PLDs. Tajana Simunic Rosing. Source: Vahid, Katz

CSE140L: Components and Design Techniques for Digital Systems Lab. CPU design and PLDs. Tajana Simunic Rosing. Source: Vahid, Katz CSE140L: Components and Design Techniques for Digital Systems Lab CPU design and PLDs Tajana Simunic Rosing Source: Vahid, Katz 1 Lab #3 due Lab #4 CPU design Today: CPU design - lab overview PLDs Updates

More information

Midterm Exam 15 points total. March 28, 2011

Midterm Exam 15 points total. March 28, 2011 Midterm Exam 15 points total March 28, 2011 Part I Analytical Problems 1. (1.5 points) A. Convert to decimal, compare, and arrange in ascending order the following numbers encoded using various binary

More information

Combinational Logic Design

Combinational Logic Design Lab #2 Combinational Logic Design Objective: To introduce the design of some fundamental combinational logic building blocks. Preparation: Read the following experiment and complete the circuits where

More information

AC103/AT103 ANALOG & DIGITAL ELECTRONICS JUN 2015

AC103/AT103 ANALOG & DIGITAL ELECTRONICS JUN 2015 Q.2 a. Draw and explain the V-I characteristics (forward and reverse biasing) of a pn junction. (8) Please refer Page No 14-17 I.J.Nagrath Electronic Devices and Circuits 5th Edition. b. Draw and explain

More information

MODULE 3. Combinational & Sequential logic

MODULE 3. Combinational & Sequential logic MODULE 3 Combinational & Sequential logic Combinational Logic Introduction Logic circuit may be classified into two categories. Combinational logic circuits 2. Sequential logic circuits A combinational

More information

Chapter 05: Basic Processing Units Control Unit Design Organization. Lesson 11: Multiple Bus Organisation

Chapter 05: Basic Processing Units Control Unit Design Organization. Lesson 11: Multiple Bus Organisation Chapter 05: Basic Processing Units Control Unit Design Organization Lesson 11: Multiple Bus Organisation Objective Understand multiple bus organisation Learn how the number of independent steps can be

More information

CS 61C: Great Ideas in Computer Architecture

CS 61C: Great Ideas in Computer Architecture CS 6C: Great Ideas in Computer Architecture Combinational and Sequential Logic, Boolean Algebra Instructor: Alan Christopher 7/23/24 Summer 24 -- Lecture #8 Review of Last Lecture OpenMP as simple parallel

More information

FPGA Design. Part I - Hardware Components. Thomas Lenzi

FPGA Design. Part I - Hardware Components. Thomas Lenzi FPGA Design Part I - Hardware Components Thomas Lenzi Approach We believe that having knowledge of the hardware components that compose an FPGA allow for better firmware design. Being able to visualise

More information

RAZOR: CIRCUIT-LEVEL CORRECTION OF TIMING ERRORS FOR LOW-POWER OPERATION

RAZOR: CIRCUIT-LEVEL CORRECTION OF TIMING ERRORS FOR LOW-POWER OPERATION RAZOR: CIRCUIT-LEVEL CORRECTION OF TIMING ERRORS FOR LOW-POWER OPERATION Shohaib Aboobacker TU München 22 nd March 2011 Based on Razor: A Low-Power Pipeline Based on Circuit-Level Timing Speculation Dan

More information

1ms Column Parallel Vision System and It's Application of High Speed Target Tracking

1ms Column Parallel Vision System and It's Application of High Speed Target Tracking Proceedings of the 2(X)0 IEEE International Conference on Robotics & Automation San Francisco, CA April 2000 1ms Column Parallel Vision System and It's Application of High Speed Target Tracking Y. Nakabo,

More information

1. True/False Questions (10 x 1p each = 10p) (a) I forgot to write down my name and student ID number.

1. True/False Questions (10 x 1p each = 10p) (a) I forgot to write down my name and student ID number. CprE 281: Digital Logic Midterm 2: Friday Oct 30, 2015 Student Name: Student ID Number: Lab Section: Mon 9-12(N) Mon 12-3(P) Mon 5-8(R) Tue 11-2(U) (circle one) Tue 2-5(M) Wed 8-11(J) Wed 6-9(Y) Thur 11-2(Q)

More information

Sequencing and Control

Sequencing and Control Sequencing and Control Lan-Da Van ( 范倫達 ), Ph. D. Department of Computer Science National Chiao Tung University Taiwan, R.O.C. Spring, 2016 ldvan@cs.nctu.edu.tw http://www.cs.nctu.edu.tw/~ldvan/ Source:

More information

Combinational vs Sequential

Combinational vs Sequential Combinational vs Sequential inputs X Combinational Circuits outputs Z A combinational circuit: At any time, outputs depends only on inputs Changing inputs changes outputs No regard for previous inputs

More information

OUT-OF-ORDER processors with precise exceptions

OUT-OF-ORDER processors with precise exceptions TRANSACTIONS ON COMPUTER, VOL. X, NO. Y, FEBRUARY 2009 1 Store Buffer Design for Multibanked Data Caches Enrique Torres, Member, IEEE, Pablo Ibáñez, Member, IEEE, Víctor Viñals-Yúfera, Member, IEEE, and

More information

University of Pennsylvania Department of Electrical and Systems Engineering. Digital Design Laboratory. Lab8 Calculator

University of Pennsylvania Department of Electrical and Systems Engineering. Digital Design Laboratory. Lab8 Calculator University of Pennsylvania Department of Electrical and Systems Engineering Digital Design Laboratory Purpose Lab Calculator The purpose of this lab is: 1. To get familiar with the use of shift registers

More information

THE USE OF forward error correction (FEC) in optical networks

THE USE OF forward error correction (FEC) in optical networks IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 52, NO. 8, AUGUST 2005 461 A High-Speed Low-Complexity Reed Solomon Decoder for Optical Communications Hanho Lee, Member, IEEE Abstract

More information

Chapter 4 (Part I) The Processor. Baback Izadi Division of Engineering Programs

Chapter 4 (Part I) The Processor. Baback Izadi Division of Engineering Programs EGC442 Introdction to Compter Architectre Chapter 4 (Part I) The Processor Baback Izadi Division of Engineering Programs bai@engr.newpaltz.ed Introdction CPU performance factors Instrction cont Determined

More information

Read-only memory (ROM) Digital logic: ALUs Sequential logic circuits. Don't cares. Bus

Read-only memory (ROM) Digital logic: ALUs Sequential logic circuits. Don't cares. Bus Digital logic: ALUs Sequential logic circuits CS207, Fall 2004 October 11, 13, and 15, 2004 1 Read-only memory (ROM) A form of memory Contents fixed when circuit is created n input lines for 2 n addressable

More information

Figure 1: segment of an unprogrammed and programmed PAL.

Figure 1: segment of an unprogrammed and programmed PAL. PROGRAMMABLE ARRAY LOGIC The PAL device is a special case of PLA which has a programmable AND array and a fixed OR array. The basic structure of Rom is same as PLA. It is cheap compared to PLA as only

More information

Last time, we saw how latches can be used as memory in a circuit

Last time, we saw how latches can be used as memory in a circuit Flip-Flops Last time, we saw how latches can be used as memory in a circuit Latches introduce new problems: We need to know when to enable a latch We also need to quickly disable a latch In other words,

More information

Digilent Nexys-3 Cellular RAM Controller Reference Design Overview

Digilent Nexys-3 Cellular RAM Controller Reference Design Overview Digilent Nexys-3 Cellular RAM Controller Reference Design Overview General Overview This document describes a reference design of the Cellular RAM (or PSRAM Pseudo Static RAM) controller for the Digilent

More information

Page 1) 7 points Page 2) 16 points Page 3) 22 points Page 4) 21 points Page 5) 22 points Page 6) 12 points. TOTAL out of 100

Page 1) 7 points Page 2) 16 points Page 3) 22 points Page 4) 21 points Page 5) 22 points Page 6) 12 points. TOTAL out of 100 EE3701 Dr. Gugel Spring 2014 Exam II ast Name First Open book/open notes, 90-minutes. Calculators are permitted. Write on the top of each page only. Page 1) 7 points Page 2) 16 points Page 3) 22 points

More information

MC9211 Computer Organization

MC9211 Computer Organization MC9211 Computer Organization Unit 2 : Combinational and Sequential Circuits Lesson2 : Sequential Circuits (KSB) (MCA) (2009-12/ODD) (2009-10/1 A&B) Coverage Lesson2 Outlines the formal procedures for the

More information

Branch management into micropipeline joint dot

Branch management into micropipeline joint dot Applied Innovations and Technologies Peer-reviewed Open access journal www.academicpublishingplatforms.com The primary version of the journal is the on-line version ATI - Applied Technologies Innovations

More information

12-bit Wallace Tree Multiplier CMPEN 411 Final Report Matthew Poremba 5/1/2009

12-bit Wallace Tree Multiplier CMPEN 411 Final Report Matthew Poremba 5/1/2009 12-bit Wallace Tree Multiplier CMPEN 411 Final Report Matthew Poremba 5/1/2009 Project Overview This project was originally titled Fast Fourier Transform Unit, but due to space and time constraints, the

More information

ACT-R ACT-R. Core Components of the Architecture. Core Commitments of the Theory. Chunks. Modules

ACT-R ACT-R. Core Components of the Architecture. Core Commitments of the Theory. Chunks. Modules ACT-R & A 1000 Flowers ACT-R Adaptive Control of Thought Rational Theory of cognition today Cognitive architecture Programming Environment 2 Core Commitments of the Theory Modularity (and what the modules

More information

Lab #12: 4-Bit Arithmetic Logic Unit (ALU)

Lab #12: 4-Bit Arithmetic Logic Unit (ALU) Lab #12: 4-Bit Arithmetic Logic Unit (ALU) ECE/COE 0501 Date of Experiment: 4/3/2017 Report Written: 4/5/2017 Submission Date: 4/10/2017 Nicholas Haver nicholas.haver@pitt.edu 1 H a v e r PURPOSE The purpose

More information

NH 67, Karur Trichy Highways, Puliyur C.F, Karur District UNIT-III SEQUENTIAL CIRCUITS

NH 67, Karur Trichy Highways, Puliyur C.F, Karur District UNIT-III SEQUENTIAL CIRCUITS NH 67, Karur Trichy Highways, Puliyur C.F, 639 114 Karur District DEPARTMENT OF ELETRONICS AND COMMUNICATION ENGINEERING COURSE NOTES SUBJECT: DIGITAL ELECTRONICS CLASS: II YEAR ECE SUBJECT CODE: EC2203

More information

Logic Design II (17.342) Spring Lecture Outline

Logic Design II (17.342) Spring Lecture Outline Logic Design II (17.342) Spring 2012 Lecture Outline Class # 03 February 09, 2012 Dohn Bowden 1 Today s Lecture Registers and Counters Chapter 12 2 Course Admin 3 Administrative Admin for tonight Syllabus

More information

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING DRONACHARYA GROUP OF INSTITUTIONS, GREATER NOIDA Affiliated to Mahamaya Technical University, Noida Approved by AICTE DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING Lab Manual for Computer Organization Lab

More information

ELEN Electronique numérique

ELEN Electronique numérique ELEN0040 - Electronique numérique Patricia ROUSSEAUX Année académique 2014-2015 CHAPITRE 6 Registers and Counters ELEN0040 6-277 Design of a modulo-8 binary counter using JK Flip-flops 3 bits are required

More information

Microprocessor Design

Microprocessor Design Microprocessor Design Principles and Practices With VHDL Enoch O. Hwang Brooks / Cole 2004 To my wife and children Windy, Jonathan and Michelle Contents 1. Designing a Microprocessor... 2 1.1 Overview

More information

We are here. Assembly Language. Processors Arithmetic Logic Units. Finite State Machines. Circuits Gates. Transistors

We are here. Assembly Language. Processors Arithmetic Logic Units. Finite State Machines. Circuits Gates. Transistors CSC258 Week 5 1 We are here Assembly Language Processors Arithmetic Logic Units Devices Finite State Machines Flip-flops Circuits Gates Transistors 2 Circuits using flip-flops Now that we know about flip-flops

More information

MODELING OF ADC ARCHITECTURES IN HDL LANGUAGES

MODELING OF ADC ARCHITECTURES IN HDL LANGUAGES MODELING OF ADC ARCHITECTURES IN HDL LANGUAGES Marco Oliveira, Nuno Franca Modeling Group, Chipidea Microelectronics, Inc. Taguspark, Edifício Inovação IV, sala 733, 2780-920 Porto Salvo, Portugal Phone

More information

PROCESSOR BASED TIMING SIGNAL GENERATOR FOR RADAR AND SENSOR APPLICATIONS

PROCESSOR BASED TIMING SIGNAL GENERATOR FOR RADAR AND SENSOR APPLICATIONS PROCESSOR BASED TIMING SIGNAL GENERATOR FOR RADAR AND SENSOR APPLICATIONS Application Note ABSTRACT... 3 KEYWORDS... 3 I. INTRODUCTION... 4 II. TIMING SIGNALS USAGE AND APPLICATION... 5 III. FEATURES AND

More information

AN ABSTRACT OF THE THESIS OF

AN ABSTRACT OF THE THESIS OF AN ABSTRACT OF THE THESIS OF Licheng Zhang for the degree of Master of Science in Electrical and Computer Engineering presented on June 7, 1989. Title: The Design of A Reduced Instruction Set Computer

More information

Chapter 3 Unit Combinational

Chapter 3 Unit Combinational EE 200: Digital Logic Circuit Design Dr Radwan E Abdel-Aal, COE Logic and Computer Design Fundamentals Chapter 3 Unit Combinational 5 Registers Logic and Design Counters Part Implementation Technology

More information

Logic Design Viva Question Bank Compiled By Channveer Patil

Logic Design Viva Question Bank Compiled By Channveer Patil Logic Design Viva Question Bank Compiled By Channveer Patil Title of the Practical: Verify the truth table of logic gates AND, OR, NOT, NAND and NOR gates/ Design Basic Gates Using NAND/NOR gates. Q.1

More information

Section 6.8 Synthesis of Sequential Logic Page 1 of 8

Section 6.8 Synthesis of Sequential Logic Page 1 of 8 Section 6.8 Synthesis of Sequential Logic Page of 8 6.8 Synthesis of Sequential Logic Steps:. Given a description (usually in words), develop the state diagram. 2. Convert the state diagram to a next-state

More information

Hardware Implementation of Viterbi Decoder for Wireless Applications

Hardware Implementation of Viterbi Decoder for Wireless Applications Hardware Implementation of Viterbi Decoder for Wireless Applications Bhupendra Singh 1, Sanjeev Agarwal 2 and Tarun Varma 3 Deptt. of Electronics and Communication Engineering, 1 Amity School of Engineering

More information

Multicore Design Considerations

Multicore Design Considerations Multicore Design Considerations Multicore: The Forefront of Computing Technology We re not going to have faster processors. Instead, making software run faster in the future will mean using parallel programming

More information

AN INTRODUCTION TO DIGITAL COMPUTER LOGIC

AN INTRODUCTION TO DIGITAL COMPUTER LOGIC SUPPLEMENTRY HPTER 1 N INTRODUTION TO DIGITL OMPUTER LOGI J K J K FREE OMPUTER HIPS FREE HOOLTE HIPS I keep telling you Gwendolyth, you ll never attract today s kids that way. S1.0 INTRODUTION 1 2 Many

More information

EECS150 - Digital Design Lecture 3 Synchronous Digital Systems Review. Announcements

EECS150 - Digital Design Lecture 3 Synchronous Digital Systems Review. Announcements EECS150 - Digital Design Lecture 3 Synchronous Digital Systems Review September 1, 2011 Elad Alon Electrical Engineering and Computer Sciences University of California, Berkeley http://www-inst.eecs.berkeley.edu/~cs150

More information