Tomasulo Algorithm Based Out of Order Execution Processor
Bhavana P. Shrivastava
Maulana Azad National Institute of Technology, Department of Electronics and Communication

ABSTRACT
In this work, an out-of-order execution processor based on Tomasulo's algorithm is implemented. Tomasulo's algorithm is the basic technique used to implement out-of-order (OOO) execution in modern microprocessors. This thesis explains the idea behind OOO execution and how Tomasulo's algorithm implements it. The algorithm describes how a processor dispatches and handles instructions, and it allows sequential instructions that would normally be stalled by certain dependencies to execute. In Tomasulo's algorithm, reservation stations are used to resolve data hazards. As a result, processor performance improves and the available memory bandwidth is used more effectively. The processor is written in the hardware description language Verilog. The thesis has two phases. First, the stages of the research are outlined, focusing on dependencies and hazards. Second, a detailed design description is given, covering the specifications, requirements, design procedure and simulation results. Design and verification of the processor were carried out in Verilog on the Xilinx 13.2 platform. The processor was verified in both simulation and synthesis with the help of test programs. The design is targeted at the Xilinx Spartan 3E XC3S1500E FPGA.

Keywords: Tomasulo Algorithm, Register Renaming, Common Data Bus, Out of Order Execution.

1. INTRODUCTION
1.1 Tomasulo Algorithm
This paper presents the design of an out-of-order processing unit based on Tomasulo's algorithm. The algorithm and related techniques such as register renaming are used in modern microprocessors to keep multiple or deeply pipelined execution units busy by executing instructions in data-flow order rather than sequential order.
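The dependencies and data hazards referred to above are usually grouped into three standard kinds: read-after-write (RAW), write-after-read (WAR) and write-after-write (WAW). The paper does not use these names, so the following is only an illustrative Python sketch, not part of the Verilog design. The `Instr` record, the `hazards` helper and the register names R1 to R4 are assumptions introduced for this example.

```python
from typing import NamedTuple, Set, Tuple


class Instr(NamedTuple):
    """One register-to-register instruction (hypothetical encoding)."""
    op: str
    dest: str
    srcs: Tuple[str, ...]


def hazards(earlier: Instr, later: Instr) -> Set[str]:
    """Return the data hazards `later` has with respect to `earlier`."""
    found: Set[str] = set()
    if earlier.dest in later.srcs:
        found.add("RAW")  # later reads what earlier writes (true dependency)
    if later.dest in earlier.srcs:
        found.add("WAR")  # later overwrites a register earlier still reads
    if later.dest == earlier.dest:
        found.add("WAW")  # both instructions write the same register
    return found


# A short program in issue order (illustrative only).
i1 = Instr("MUL", "R1", ("R2", "R3"))
i2 = Instr("ADD", "R4", ("R1", "R2"))  # needs the result of i1
i3 = Instr("SUB", "R2", ("R3", "R3"))  # overwrites R2, which i1 and i2 read
i4 = Instr("ADD", "R1", ("R3", "R3"))  # overwrites the destination of i1

print(sorted(hazards(i1, i2)))  # ['RAW']
print(sorted(hazards(i1, i3)))  # ['WAR']
print(sorted(hazards(i1, i4)))  # ['WAW']
```

Only the RAW case is a true data dependency. WAR and WAW arise from reusing register names, which is why register renaming can remove them.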
The complex variability of instruction flow in out-of-order processors presents a significant opportunity for undetected errors, compared to an in-order pipelined machine where the flow of instructions is fixed and orderly. A Tomasulo-based processor addresses the prominent problems of dependencies, hazards and stalls. To implement Tomasulo's algorithm, the following steps have to be followed.
1. Instructions are issued sequentially so that the effects of a sequence of instructions, such as exceptions raised by these instructions, occur in the same order as they would in a non-pipelined processor, even though the instructions are executed non-sequentially.
2. All general-purpose and reservation station registers hold either real or virtual values (called tags here). If a real value is unavailable to a destination register during the issue stage, a virtual value (tag) is used initially. The tag identifies the functional unit that is computing the real value. Virtual register values are converted to real values as soon as the designated functional unit completes its computation and puts the result on the bus.
3. Functional units use reservation stations with multiple slots. Each slot holds the information needed to execute a single instruction, including the operation and the operands. The functional unit begins processing when it is free and when all source operands needed for an instruction are real.
The design implemented in this paper can decode four types of instructions: ADD/SUB, MUL, FETCH and WRITE. Accordingly, there are four functional blocks in the design, performing addition/subtraction, multiplication, fetching from memory and writing to memory. Data is communicated through registers. Four registers are available for use in instructions. The registers are arranged and handled using the completion file, which is also used to handle special conditions such as overflow and page fault. The instruction dispatch unit reads the instructions from the instruction queue in order and decodes them. It is designed to dispatch two instructions in parallel, on two instruction buses.

Fig. 1.1: A model of an implementation of Tomasulo's algorithm.

1.2 OUT OF ORDER EXECUTION
Tomasulo's algorithm is the basic technique used to implement out-of-order (OOO) execution in modern microprocessors. To achieve greater instruction throughput, superscalar microprocessors use several functional units that can execute instructions in parallel. However, if two instructions depend on each other, one of them has to wait until the other has finished. In this case, one functional unit is idle. But if a different instruction, potentially following the other two in the instruction sequence, does not depend on their results, it can be executed in parallel on the free functional unit. Data hazards must be handled properly. In general, a data hazard arises when changing the instruction execution order influences the result of the computation. Tomasulo's algorithm was designed to avoid such problems.

2. LITERATURE SURVEY
Tomasulo's algorithm originated in the System/360 computer family. This section gives an overview of how, why and when the algorithm was developed [1,2]. The IBM System/360 is a family of computer systems developed in the 1960s, whose chief architect was the well-known Gene Amdahl [16]. Prior to the announcement of this family, computers were custom made and designed independently. This development indicated that a revolution was underway that would change the computer industry forever. Initially only 6 models were announced: 30, 40, 50, 60, 62 and 70, whereas in fact 14 models were produced: 20, 22, 25, 30, 40, 44, 50, 65, 67, 75, 85, 91, 95 and 195 [16]. Despite the models' individual differences, the System/360 family employed the same user instruction set.
The larger machines handled complex instructions in hardware, whilst the smaller ones handled them in microcode, where an instruction such as multiplication would be completed by repeated addition. As we know today, this was an extremely inefficient way to execute a multiplication instruction [10]. (It was also rumored that the smaller 360 machines performed addition by repeated increments, i.e. x + 5 was computed by adding a 1 bit five times [13].) The System/360 employed a variety of operating systems [14], such as DOS/360, OS/360, CP-67 (later VM/370), MTS, CRJE, TSO and Amdahl's UTS. OS/360 proved to be the most popular. The 360 computer family had a very limited number of registers, initially only four double-precision floating-point registers. Consequently, compiler scheduling was not particularly effective. On top of this, even the more optimal 360 designs took considerable time to access memory and to compute long floating-point equations. These constraining factors prompted the development of a solution that would attain maximum efficiency [10,11,12]. The ultimate solution to these problems came in the form of Tomasulo's algorithm.

3. IMPLEMENTATION OF THE LOGIC DESIGN
Xilinx ISE 13.2 is used to implement, in Verilog HDL, all the modules of the architecture of the Tomasulo-based out-of-order execution processor.

3.1 SIMULATION AND SYNTHESIS RESULTS
The RTL schematic of the Tomasulo-based processor is shown in Fig. 3.1. The device utilization summary is given below.
Number of Slices: 537 out of %
Number of Slice Flip Flops: 407 out of %
Number of 4 input LUTs: 838 out of %
Number of IOs: 8
Number of bonded IOBs: 8 out of 232 3%
IOB Flip Flops: 8
Number of GCLKs: 1 out of 24 4%

Fig. 3.1: Tomasulo block

Fetch station
The RTL schematic of the Fetch block is shown in Fig. 3.2. The device utilization summary is given below.
Number of Slices: %
Number of Slice Flip Flops: %
Number of 4 input LUTs: of %
Number of IOs: 34
Number of bonded IOBs: out of 8 out of 24 out

Fig. 3.2: Fetch block

3.1.2 Instruction Decode Unit
The RTL schematic of the Instruction Decode Unit is shown in Fig. 3.3. The device utilization summary is given below.
Fig. 3.3: Instruction Decode Unit

Number of Slices: 67 out of %
Number of Slice Flip Flops: 57 out of %
Number of 4 input LUTs: 129 out of %
Number of IOs: 186
Number of bonded IOBs: 186 out of %

3.1.2 Reservation station
The RTL schematic of the Reservation station is shown in Fig. 3.4. The device utilization summary is also given below.

Fig. 3.4: Instruction Decode Unit

3.1.3 Register bank
The RTL schematic of the Register bank is shown in Fig. 3.5. The device utilization summary is given below.

Fig. 3.5: Register bank

Number of Slices: 0 out of %
Number of IOs: 20
Number of bonded IOBs: 1 out of 232 0%

Number of Slices: 141 out of %
Number of Slice Flip Flops: 128 out of %
Number of 4 input LUTs: 168 out of %
Number of IOs: 60
Number of bonded IOBs: 60 out of %
Number of GCLKs: 1 out of 24 4%

3.1.4 Write Block
The RTL schematic of the Write Block is shown in Fig. 3.6. The device utilization summary is given below.

Fig. 3.6: Write Block

Number of Slices: 9 out of %
Number of 4 input LUTs: 16 out of %
Number of IOs: 60
Number of bonded IOBs: 60 out of %
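The register bank and reservation station blocks above carry the real/virtual (tag) value scheme described in Section 1.1. The following is a minimal behavioural sketch of that scheme in Python, not the author's Verilog. The class names `Slot` and `Core`, the slot tags such as "MUL0", and the initial register contents are all assumptions made for illustration; memory instructions (FETCH, WRITE) are left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Union

Operand = Union[int, str]  # int = real value, str = tag (virtual value)

OPS = {"ADD": lambda a, b: a + b,
       "SUB": lambda a, b: a - b,
       "MUL": lambda a, b: a * b}


@dataclass
class Slot:
    """One reservation-station slot: an operation and its two operands."""
    tag: str
    op: str
    a: Operand
    b: Operand

    def ready(self) -> bool:
        # Execution may start only when both operands are real values.
        return isinstance(self.a, int) and isinstance(self.b, int)


class Core:
    def __init__(self, regs: Dict[str, int]) -> None:
        self.regs: Dict[str, Operand] = dict(regs)
        self.slots: List[Slot] = []

    def issue(self, tag: str, op: str, dest: str, src1: str, src2: str) -> Slot:
        # Copy each source as it is now: a real value, or the tag of its producer.
        slot = Slot(tag, op, self.regs[src1], self.regs[src2])
        self.slots.append(slot)
        self.regs[dest] = tag  # the destination now holds a virtual value
        return slot

    def complete(self, slot: Slot) -> None:
        assert slot.ready(), "operands must be real before execution"
        value = OPS[slot.op](slot.a, slot.b)
        self.slots.remove(slot)
        # Broadcast on the bus: every holder of this tag captures the value.
        for reg, held in list(self.regs.items()):
            if held == slot.tag:
                self.regs[reg] = value
        for other in self.slots:
            if other.a == slot.tag:
                other.a = value
            if other.b == slot.tag:
                other.b = value


core = Core({"R1": 2, "R2": 3, "R3": 4, "R4": 0})
mul0 = core.issue("MUL0", "MUL", "R1", "R2", "R3")  # R1 = R2 * R3
add0 = core.issue("ADD0", "ADD", "R4", "R1", "R2")  # R4 = R1 + R2, waits on MUL0
add1 = core.issue("ADD1", "ADD", "R2", "R3", "R3")  # R2 = R3 + R3, independent
ready_before = add0.ready()                         # False: R1 is still a tag

core.complete(add1)  # finishes first, out of program order
core.complete(mul0)  # broadcast turns ADD0's tag operand into a real value
core.complete(add0)
print(core.regs)     # {'R1': 12, 'R2': 8, 'R3': 4, 'R4': 15}
```

Although ADD1 overwrites R2 before ADD0 has executed, ADD0 still uses the old value 3 because it copied that value at issue time. The final register contents match sequential execution.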
The presented Tomasulo-based processor avoids the instruction stalls that different types of data hazards can cause. This improves the performance of the processor (shown in Fig. 3.7 and Fig. 3.8). The Tomasulo-based processor completes its execution in 150 ns (shown in Fig. 3.8), whereas the processor with stalling cannot complete its execution in the same time and takes longer (shown in Fig. 3.7). The presented Tomasulo-based processor therefore improves performance.

4. CONCLUSION
This work presented an out-of-order execution processor based on Tomasulo's algorithm. The processor was synthesized in Xilinx 13.2 and simulated in the simulation environment of Xilinx ISE. The device chosen for synthesis was the XC3S1500E. Coding was done in Verilog HDL. The processor improves performance and avoids instruction stalls caused by different hazards. It uses register renaming to overcome the hazard problem. The Tomasulo algorithm based out-of-order execution processor is therefore more efficient and very useful in modern processor design.

Fig. 3.7: Simulation result of test program 1 with stalling
Fig. 3.8: Simulation result of test program 1 without stalling
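The difference between the stalling and non-stalling runs in Fig. 3.7 and Fig. 3.8 can be illustrated with a small timing model. The sketch below is a Python approximation under stated assumptions, not a reproduction of the paper's test program or of its 150 ns result. The latencies, the four-instruction program and the `cycles` function are hypothetical. One instruction is issued per cycle, and structural hazards are ignored.

```python
from typing import Dict, List, NamedTuple, Tuple


class Instr(NamedTuple):
    op: str
    dest: str
    srcs: Tuple[str, ...]


# Assumed execution latencies in clock cycles (not taken from the paper).
LATENCY = {"ADD": 1, "SUB": 1, "MUL": 4}


def cycles(program: List[Instr], in_order: bool) -> int:
    """Cycle in which the last result is produced.

    in_order=True : a waiting instruction also blocks every later one.
    in_order=False: an instruction starts as soon as it has been issued and
                    its source operands are available; renaming is assumed
                    to have removed WAR and WAW hazards.
    """
    ready: Dict[str, int] = {}  # register -> cycle its newest value is ready
    prev_start = -1
    last_finish = 0
    for index, ins in enumerate(program):
        operands_ready = max((ready.get(s, 0) for s in ins.srcs), default=0)
        earliest = prev_start + 1 if in_order else index
        start = max(earliest, operands_ready)
        finish = start + LATENCY[ins.op]
        ready[ins.dest] = finish
        prev_start = start
        last_finish = max(last_finish, finish)
    return last_finish


program = [
    Instr("MUL", "R1", ("R2", "R3")),  # long-latency producer
    Instr("ADD", "R4", ("R1", "R2")),  # must wait for the MUL result
    Instr("SUB", "R3", ("R2", "R3")),  # independent of the MUL result
    Instr("ADD", "R2", ("R3", "R3")),  # depends only on the SUB
]

print(cycles(program, in_order=True))   # 7
print(cycles(program, in_order=False))  # 5
```

In the stalling model the SUB and the final ADD wait behind the dependent ADD. In the out-of-order model they execute while the multiplication is still in progress, so the program finishes earlier.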
REFERENCES
[1] K. Aasaraai and A. Moshovos, "Towards a viable out-of-order soft core: copy-free, checkpointed register renaming," Proceedings of the Field-Programmable Logic and Applications.
[2] S. Petit, J. Sahuquillo, P. López, R. Ubal, and J. Duato, "A Complexity-Effective Out-of-Order Retirement Microarchitecture," IEEE Trans. Computers, vol. 58, no. 12, Dec.
[3] R. Plyaskin and A. Herkersdorf, "Context-aware compiled simulation of out-of-order processor behavior based on atomic traces," in 2011 IEEE/IFIP 19th International Conference on VLSI and System-on-Chip (VLSI-SoC), IEEE, Oct.
[4] F.J. Mesa-Martinez et al., "SCOORE: Santa Cruz out-of-order RISC engine, FPGA design issues," Proceedings of the Workshop on Architectural Research Prototyping, 2006.
[5] S. Berezin, A. Biere, E. Clarke, and Y. Zhu, "Combining symbolic model checking with uninterpreted functions for out-of-order processor verification," in FMCAD '98, Lecture Notes in Computer Science, Vol. 1522, Springer-Verlag, Berlin.
[6] A. Biere, A. Cimatti, E.M. Clarke, and Y. Zhu, "Symbolic model checking without BDDs," in TACAS '99, Lecture Notes in Computer Science, Vol. 1579, Springer-Verlag, Amsterdam, The Netherlands.
[7] R. M. Tomasulo, "An efficient algorithm for exploiting multiple arithmetic units," IBM J. Research and Development 11:1, January.
[8] W. Damm and A. Pnueli, "Verifying out-of-order executions," in D. Probst (Ed.), CHARME '97, Chapman & Hall.
[9] D. Sima, B. Polytech, "The design space of register renaming techniques," Journal of Micro, IEEE, 20(5), 2000.
[10] S. Palacharla et al., "Complexity-Effective Superscalar Processors," Proceedings of the International Symposium on Computer Architecture.
[11] F.J. Mesa-Martinez et al., "SCOORE: Santa Cruz out-of-order RISC engine, FPGA design issues," Proceedings of the Workshop on Architectural Research Prototyping.
[12] K. Aasaraai and A. Moshovos, "Towards a viable out-of-order soft core: copy-free, checkpointed register renaming," Proceedings of the Field-Programmable Logic and Applications.
[13] W. Damm and A. Pnueli, "Verifying out-of-order executions," in D. Probst (Ed.), CHARME '97, Chapman & Hall, London.
[14] L. Gwennap, "Intel's P6 uses decoupled superscalar design," Microprocessor Report, Vol. 9, No. 2, pp. 9-15, 1995.
[15] J. Hennessy and D. Patterson, Computer Architecture: A Quantitative Approach, Morgan Kaufmann, San Mateo, CA.
[16] Peter Dell, "Die Auswirkung von Mechanismen zur out-of-order Ausführung auf den Cyclecount von RISC-Architekturen," Master's thesis, Universität des Saarlandes, FB Informatik.