Generalized Pattern Matching Micro-Engine
|
|
- Daniella Owen
- 5 years ago
- Views:
Transcription
1 Generalized Pattern Matching Micro-Engine Yuanwei Fang*, Raihan Rasool, Dilip Vasudevan*, Andrew A. Chien* University of Chicago * Argonne National Laboratory King Faisal University
2 Big Data Applications Deep Packet Inspection Bioinformatics (DNA Alignment) JSON/XML Parsing Signal Triggering 6/24/2014 UNIVERSITY OF CHICAGO 2
3 Deep Packet Inspection High speed network : 100Gb/s Growing number of patterns: 6000 Snort Rules Speed requirement: > 75 Tera DFAops/s Power budget : 200 W Energy efficiency requirement: > 375Gops/J 6/24/2014 UNIVERSITY OF CHICAGO 3
4 Bioinformatics (DNA Alignment) Bioinformatics database: millions of species Speed requirement: > 1 Tera DFAops/s Power budget : 200 W Energy efficiency requirement: > 5 Gops/J Genome size: 130G base pairs 6/24/2014 UNIVERSITY OF CHICAGO 4
5 Deterministic Finite Automata (DFA) 6/24/2014 UNIVERSITY OF CHICAGO 5
6 Programmable Approaches target Intel Xeon E5-2600: 17G DFAops/second with 130W, 0.13Gops/J ; 6/24/2014 UNIVERSITY OF CHICAGO 6
7 Approach Workload M input characters(m DFA transitions) N DFA rules perform on the M input characters Goal Compute N x M transitions efficiently Approach Parallelize DFA execution Fused Instruction 6/24/2014 UNIVERSITY OF CHICAGO 7
8 What Is Micro-Engine Generalized Pattern Matching Micro-Engine ( GenPM ) is one micro-engine of 10x10 approach Local Memory I-Cache Basic RISC CPU I-Cache Microengine 2 I-Cache Microengine 3 I-Cache Microengine 4 I-Cache GenPM I-Cache Microengine 6 I-Cache Microengine 7 I-Cache Microengine 8 Shared L1 Data Cache 6/24/2014 UNIVERSITY OF CHICAGO 8
9 GenPM Micro Architecture 6/24/2014 UNIVERSITY OF CHICAGO 9
10 Fused Instructions: Multi-Step String buffer a b c Acc_Vec 0 1 Current State A D Q 1 Q 4 ALU address Local Mem Accept ENB Next State 6/24/2014 UNIVERSITY OF CHICAGO 10
11 Fused Instructions: Multi-Step String buffer a b c Acc_Vec 0 1 Current State A D Q 1 Q 4 ALU address Local Mem Accept ENB Next State 6/24/2014 UNIVERSITY OF CHICAGO 11
12 Fused Instructions: Multi-Step String buffer a b c Acc_Vec Current State A D Q 1 Q 4 ALU address Local Mem Accept ENB Next State 6/24/2014 UNIVERSITY OF CHICAGO 12
13 Fused Instructions: Multi-Step String buffer a b c Acc_Vec Current State Acc_Vec A D ENB Q 1 Q 4 ALU address Local Mem Accept CHECK Next State 6/24/2014 UNIVERSITY OF CHICAGO 13
14 Parallel DFA: Vector Instruction SSE ADD /24/2014 UNIVERSITY OF CHICAGO 14
15 Parallel DFA: Vector Instruction GMVSNEXT DFAop DFAop DFAop DFAop DFAop DFAop DFAop 6/24/2014 UNIVERSITY OF CHICAGO 15
16 GenPM Code Example Data movement Multi-step parallel DFA execution Find precise matching position 6/24/2014 UNIVERSITY OF CHICAGO 16
17 Methodology Design space: Parallelism and step length Baseline 32-bit 6-stage in-order RISC 4GB DDR3 DRAM 32KB L1 I-cache, 24KB L1 D-cache, 512KB L2 (modeled on Intel Silverthorne) GenPM 1MB Local memory (up to 64 banks) Vector and Fused Instructions Performance/Power Model Core : 32nm synthesis by Synopsys Processor Designer Memories : MARSSX86/CACTI 6 + DRAMSim2 Workload 64 Snort rules from snapshot, 10KB random network dump 6/24/2014 UNIVERSITY OF CHICAGO 17
18 speedup versus RISC Performance GenPM_8way Speedup GenPM_64way step length 6/24/2014 UNIVERSITY OF CHICAGO 18
19 energy improvement versus RISC Energy Efficiency GenPM_8way GenPM_64way step length 6/24/2014 UNIVERSITY OF CHICAGO 19
20 Throughput per watt(gops/j) Throughput/watt (absolute) GenPM_8way Throughput/watt GenPM_64way step length Scale to a 75W chip, GenPM delivers > 2.6 Tera DFAops/second 6/24/2014 UNIVERSITY OF CHICAGO 20
21 total energy Energy Breakdown 100% 90% 80% 70% 60% 50% 40% 30% 20% 10% 0% RISC GenPM_8B_1S GenPM_8B_8S GenPM_8B_16S GenPM_64B_1S GenPM_64B_8S GenPM_64B_16S LM_max = 83% LM L1_I L1_D L2 DRAM Core 6/24/2014 UNIVERSITY OF CHICAGO 21
22 General Comparison 6/24/2014 UNIVERSITY OF CHICAGO 22
23 Related Work ASIC: [Brodie, et.al. ISCA 2006], [Titanic System RXP], [ Cisco SCE ] FPGA: [Yang Xu, et.al. ANCS 2011], [ T Song, et.al. INFOCOM 2008], [I Sourdis et.al. VLSI 2008] CPU: [Mytkowicz et.al. ASPLOS 2014 ], [ Intel HyperScan] GPU: [Vasiliadis G, et.al. CCS 2011], [ Lin CH, et.al. INFOCOM 2012] SoC: [C Johnson et.al. ISSCC 2010 ], [ Cavium Octeon ], [ IBM PowerEN ] 6/24/2014 UNIVERSITY OF CHICAGO 23
24 Summery GenPM is a high performance and energy efficient accelerator for pattern matching workloads ISA exploits parallelism and multi-step execution Scale to a 75W chip, GenPM delivers > 2.6 Tera DFAops/second GenPM approaches ASIC efficiency and integrates it into a programmable core 6/24/2014 UNIVERSITY OF CHICAGO 24
25 Future Work DFA table compression Scale up with multiple GenPM micro-engines Explore more applications 6/24/2014 UNIVERSITY OF CHICAGO 25
26 Acknowledgements Defense Advanced Research Projects Agency (DARPA) Agilent Technologies (now Keysight Technologies) Synopsys Academic program Dr. Tung Hoang and members of the Large Scale Systems Group in the Department of Computer Science 6/24/2014 UNIVERSITY OF CHICAGO 26
High Performance Microprocessor Design and Automation: Overview, Challenges and Opportunities IBM Corporation
High Performance Microprocessor Design and Automation: Overview, Challenges and Opportunities Introduction About Myself What to expect out of this lecture Understand the current trend in the IC Design
More informationA CYCLES/MB H.264/AVC MOTION COMPENSATION ARCHITECTURE FOR QUAD-HD APPLICATIONS
9th European Signal Processing Conference (EUSIPCO 2) Barcelona, Spain, August 29 - September 2, 2 A 6-65 CYCLES/MB H.264/AVC MOTION COMPENSATION ARCHITECTURE FOR QUAD-HD APPLICATIONS Jinjia Zhou, Dajiang
More informationEIE: Efficient Inference Engine on Compressed Deep Neural Network
EIE: Efficient Inference Engine on Compressed Deep Neural Network Song Han*, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark Horowitz, Bill Dally Stanford University June 20, 2016 Deep Learning on
More informationScalability of MB-level Parallelism for H.264 Decoding
Scalability of Macroblock-level Parallelism for H.264 Decoding Mauricio Alvarez Mesa 1, Alex Ramírez 1,2, Mateo Valero 1,2, Arnaldo Azevedo 3, Cor Meenderinck 3, Ben Juurlink 3 1 Universitat Politècnica
More informationHarnessing the Four Horsemen of the Coming Dark Silicon Apocalypse
Dark Silicon Workshop Kick-off Talk Harnessing the Four Horsemen of the Coming Dark Silicon Apocalypse Michael B. Taylor Associate Professor (July 2012) University of California, San Diego This Talk The
More informationA Low-Power 0.7-V H p Video Decoder
A Low-Power 0.7-V H.264 720p Video Decoder D. Finchelstein, V. Sze, M.E. Sinangil, Y. Koken, A.P. Chandrakasan A-SSCC 2008 Outline Motivation for low-power video decoders Low-power techniques pipelining
More informationLossless Compression Algorithms for Direct- Write Lithography Systems
Lossless Compression Algorithms for Direct- Write Lithography Systems Hsin-I Liu Video and Image Processing Lab Department of Electrical Engineering and Computer Science University of California at Berkeley
More informationUsing Embedded Dynamic Random Access Memory to Reduce Energy Consumption of Magnetic Recording Read Channel
IEEE TRANSACTIONS ON MAGNETICS, VOL. 46, NO. 1, JANUARY 2010 87 Using Embedded Dynamic Random Access Memory to Reduce Energy Consumption of Magnetic Recording Read Channel Ningde Xie 1, Tong Zhang 1, and
More informationFPGA-BASED IMPLEMENTATION OF A REAL-TIME 5000-WORD CONTINUOUS SPEECH RECOGNIZER
FPGA-BASED IMPLEMENTATION OF A REAL-TIME 5000-WORD CONTINUOUS SPEECH RECOGNIZER Young-kyu Choi, Kisun You, and Wonyong Sung School of Electrical Engineering, Seoul National University San 56-1, Shillim-dong,
More informationHighly Parallel HEVC Decoding for Heterogeneous Systems with CPU and GPU
2017. This manuscript version (accecpted manuscript) is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/. Highly Parallel HEVC Decoding for Heterogeneous
More informationTime-Saving Features in Economy Oscilloscopes Streamline Test
Time-Saving Features in Economy Oscilloscopes Streamline Test Application Note Oscilloscopes are the go-to tool for debug and troubleshooting, whether you work in &, manufacturing or education. Like other
More informationFurther Details Contact: A. Vinay , , #301, 303 & 304,3rdFloor, AVR Buildings, Opp to SV Music College, Balaji
S.NO 2018-2019 B.TECH VLSI IEEE TITLES TITLES FRONTEND 1. Approximate Quaternary Addition with the Fast Carry Chains of FPGAs 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18. 19. A Low-Power
More information32 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 45, NO. 1, JANUARY 2010
32 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 45, NO. 1, JANUARY 2010 A 201.4 GOPS 496 mw Real-Time Multi-Object Recognition Processor With Bio-Inspired Neural Perception Engine Joo-Young Kim, Student
More informationOptimization of Multi-Channel BCH Error Decoding for Common Cases. Russell Dill Master's Thesis Defense April 20, 2015
Optimization of Multi-Channel BCH Error Decoding for Common Cases Russell Dill Master's Thesis Defense April 20, 2015 Bose-Chaudhuri-Hocquenghem (BCH) BCH is an Error Correcting Code (ECC) and is used
More informationA High-Performance Parallel CAVLC Encoder on a Fine-Grained Many-core System
A High-Performance Parallel CAVLC Encoder on a Fine-Grained Many-core System Zhibin Xiao and Bevan M. Baas VLSI Computation Lab, ECE Department University of California, Davis Outline Introduction to H.264
More informationLow Power Design: From Soup to Nuts. Tutorial Outline
Low Power Design: From Soup to Nuts Mary Jane Irwin and Vijay Narayanan Dept of CSE, Microsystems Design Lab Penn State University (www.cse.psu.edu/~mdl) ISCA Tutorial: Low Power Design Introduction.1
More informationMauricio Álvarez-Mesa ; Chi Ching Chi ; Ben Juurlink ; Valeri George ; Thomas Schierl Parallel video decoding in the emerging HEVC standard
Mauricio Álvarez-Mesa ; Chi Ching Chi ; Ben Juurlink ; Valeri George ; Thomas Schierl Parallel video decoding in the emerging HEVC standard Conference object, Postprint version This version is available
More informationDay 21: Retiming Requirements. ESE534: Computer Organization. Relative Sizes. Today. State. State Size
ESE534: Computer Organization Day 22: November 16, 2016 Retiming 1 Day 21: Retiming Requirements Retiming requirement depends on parallelism and performance Even with a given amount of parallelism Will
More informationESE (ESE534): Computer Organization. Last Time. Today. Last Time. Align Data / Balance Paths. Retiming in the Large
ESE680-002 (ESE534): Computer Organization Day 20: March 28, 2007 Retiming 2: Structures and Balance Last Time Saw how to formulate and automate retiming: start with network calculate minimum achievable
More informationMotion Compensation Hardware Accelerator Architecture for H.264/AVC
Motion Compensation Hardware Accelerator Architecture for H.264/AVC Bruno Zatt 1, Valter Ferreira 1, Luciano Agostini 2, Flávio R. Wagner 1, Altamiro Susin 3, and Sergio Bampi 1 1 Informatics Institute
More informationSoC IC Basics. COE838: Systems on Chip Design
SoC IC Basics COE838: Systems on Chip Design http://www.ee.ryerson.ca/~courses/coe838/ Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan Electrical and Computer Engineering Ryerson University Overview SoC
More informationA Design Of A Low Power Delay Buffer Using Ring Counter Addressing Schemes
A Design Of A Low Power Delay Buffer Using Ring Counter Addressing Schemes B.R.B Jaswanth St.Theresa Institute of Engineering and Technology Gudiwada, India Abstract This work presents circuit design of
More informationAN EFFECTIVE CACHE FOR THE ANYWHERE PIXEL ROUTER
University of Kentucky UKnowledge Theses and Dissertations--Electrical and Computer Engineering Electrical and Computer Engineering 2007 AN EFFECTIVE CACHE FOR THE ANYWHERE PIXEL ROUTER Vijai Raghunathan
More informationL12: Reconfigurable Logic Architectures
L12: Reconfigurable Logic Architectures Acknowledgements: Materials in this lecture are courtesy of the following sources and are used with permission. Frank Honore Prof. Randy Katz (Unified Microelectronics
More informationJun-Hao Zheng et al.: An Efficient VLSI Architecture for MC of AVS HDTV Decoder 371 ture for MC which contains a three-stage pipeline. The hardware ar
May 2006, Vol.21, No.3, pp.370 377 J. Comput. Sci. & Technol. An Efficient VLSI Architecture for Motion Compensation of AVS HDTV Decoder Jun-Hao Zheng 1;3 (ΨΞ ), Lei Deng 2 ( Π), Peng Zhang 1;3 (Φ ±),
More informationThe main design objective in adder design are area, speed and power. Carry Select Adder (CSLA) is one of the fastest
ISSN: 0975-766X CODEN: IJPTFI Available Online through Research Article www.ijptonline.com IMPLEMENTATION OF FAST SQUARE ROOT SELECT WITH LOW POWER CONSUMPTION V.Elanangai*, Dr. K.Vasanth Department of
More informationLow-Power Techniques for Video Decoding. Daniel Frederic Finchelstein
Low-Power Techniques for Video Decoding by Daniel Frederic Finchelstein Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree
More informationAmdahl s Law in the Multicore Era
Amdahl s Law in the Multicore Era Mark D. Hill and Michael R. Marty University of Wisconsin Madison August 2008 @ Semiahmoo Workshop IBM s Dr. Thomas Puzak: Everyone knows Amdahl s Law 2008 Multifacet
More informationIEEE Santa Clara ComSoc/CAS Weekend Workshop Event-based analog sensing
IEEE Santa Clara ComSoc/CAS Weekend Workshop Event-based analog sensing Theodore Yu theodore.yu@ti.com Texas Instruments Kilby Labs, Silicon Valley Labs September 29, 2012 1 Living in an analog world The
More informationComputer and Machine Vision
Computer and Machine Vision Lecture Week 3 Part-1 January 27, 2014 Sam Siewert Outline of Week 3 Processing Images and Moving Pictures High Level View and Computer Architecture for it Linux Platforms for
More informationL11/12: Reconfigurable Logic Architectures
L11/12: Reconfigurable Logic Architectures Acknowledgements: Materials in this lecture are courtesy of the following people and used with permission. - Randy H. Katz (University of California, Berkeley,
More informationGPU Acceleration of a Production Molecular Docking Code
GPU Acceleration of a Production Molecular Docking Code Bharat Sukhwani Martin Herbordt Computer Architecture and Automated Design Laboratory Department of Electrical and Computer Engineering Boston University
More informationFPGA Implementation of Low Power Self Testable MIPS Processor
American-Eurasian Journal of Scientific Research 12 (3): 135-144, 2017 ISSN 1818-6785 IDOSI Publications, 2017 DOI: 10.5829/idosi.aejsr.2017.135.144 FPGA Implementation of Low Power Self Testable MIPS
More informationBubble Razor An Architecture-Independent Approach to Timing-Error Detection and Correction
1 Bubble Razor An Architecture-Independent Approach to Timing-Error Detection and Correction Matthew Fojtik, David Fick, Yejoong Kim, Nathaniel Pinckney, David Harris, David Blaauw, Dennis Sylvester mfojtik@umich.edu
More informationComp 410/510. Computer Graphics Spring Introduction to Graphics Systems
Comp 410/510 Computer Graphics Spring 2018 Introduction to Graphics Systems Computer Graphics Computer graphics deals with all aspects of 'creating images with a computer - Hardware (PC with graphics card)
More informationPRACE Autumn School GPU Programming
PRACE Autumn School 2010 GPU Programming October 25-29, 2010 PRACE Autumn School, Oct 2010 1 Outline GPU Programming Track Tuesday 26th GPGPU: General-purpose GPU Programming CUDA Architecture, Threading
More informationMethodology. Nitin Chawla,Harvinder Singh & Pascal Urard. STMicroelectronics
An Algorithm to Silicon ESL Design Methodology Nitin Chawla,Harvinder Singh & Pascal Urard STMicroelectronics SOC Design Challenges:Increased Complexity 992 994 996 998 2 22 24 26 28 2.7.5.35.25.8.3 9
More informationOL_H264MCLD Multi-Channel HDTV H.264/AVC Limited Baseline Video Decoder V1.0. General Description. Applications. Features
OL_H264MCLD Multi-Channel HDTV H.264/AVC Limited Baseline Video Decoder V1.0 General Description Applications Features The OL_H264MCLD core is a hardware implementation of the H.264 baseline video compression
More informationLow Power VLSI Circuits and Systems Prof. Ajit Pal Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur
Low Power VLSI Circuits and Systems Prof. Ajit Pal Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No. # 29 Minimizing Switched Capacitance-III. (Refer
More informationAn FPGA Implementation of Shift Register Using Pulsed Latches
An FPGA Implementation of Shift Register Using Pulsed Latches Shiny Panimalar.S, T.Nisha Priscilla, Associate Professor, Department of ECE, MAMCET, Tiruchirappalli, India PG Scholar, Department of ECE,
More informationDesign Challenge of a QuadHDTV Video Decoder
Design Challenge of a QuadHDTV Video Decoder Youn-Long Lin Department of Computer Science National Tsing Hua University MPSOC27, Japan More Pixels YLLIN NTHU-CS 2 NHK Proposes UHD TV Broadcast Super HiVision
More informationParallelization of Multimedia Applications by Compiler on Multicores for Consumer Electronics
Vol. 0 No. 0 1959 TV MPEG2 MP3 JPEG 2000 OSCAR API VLIW 4 FR1000 SH-4A 4 RP1 FR1000 4 1 4 3.27 RP1 4 1 4 3.31 Parallelization of Multimedia Applications by Compiler on Multicores for Consumer Electronics
More informationThe Multistandard Full Hd Video-Codec Engine On Low Power Devices
The Multistandard Full Hd Video-Codec Engine On Low Power Devices B.Susma (M. Tech). Embedded Systems. Aurora s Technological & Research Institute. Hyderabad. B.Srinivas Asst. professor. ECE, Aurora s
More informationReconfigurable FPGA Implementation of FIR Filter using Modified DA Method
Reconfigurable FPGA Implementation of FIR Filter using Modified DA Method M. Backia Lakshmi 1, D. Sellathambi 2 1 PG Student, Department of Electronics and Communication Engineering, Parisutham Institute
More informationPower Efficient Architectures to Accelerate Deep Convolutional Neural Networks for edge computing and IoT
Power Efficient Architectures to Accelerate Deep Convolutional Neural Networks for edge computing and IoT Giuseppe Desoli ST Central Labs STMicroelectronics Artificial Intelligence is Everywhere 2 Analysis,
More informationObjectives. Combinational logics Sequential logics Finite state machine Arithmetic circuits Datapath
Objectives Combinational logics Sequential logics Finite state machine Arithmetic circuits Datapath In the previous chapters we have studied how to develop a specification from a given application, and
More informationUSING FUSION SYSTEM ARCHITECTURE FOR BROADCAST VIDEO. Edward Callway AMD
USING FUSION SYSTEM ARCHITECTURE FOR BROADCAST VIDEO Edward Callway AMD USING PC COMPONENTS FOR BROADCAST VIDEO Video processing from pure analog to digital compute PC Design for video Parallel GPU computing
More informationExploring Architecture Parameters for Dual-Output LUT based FPGAs
Exploring Architecture Parameters for Dual-Output LUT based FPGAs Zhenghong Jiang, Colin Yu Lin, Liqun Yang, Fei Wang and Haigang Yang System on Programmable Chip Research Department, Institute of Electronics,
More informationRedEye Analog ConvNet Image Sensor Architecture for Continuous Mobile Vision
Analog ConvNet Image Sensor Architecture for Continuous Mobile Vision Robert LiKamWa Yunhui Hou Yuan Gao Mia Polansky Lin Zhong roblkw@rice.edu houyh@rice.edu yg18@rice.edu mia.polansky@rice.edu lzhong@rice.edu
More informationESE534: Computer Organization. Today. Image Processing. Retiming Demand. Preclass 2. Preclass 2. Retiming Demand. Day 21: April 14, 2014 Retiming
ESE534: Computer Organization Today Retiming Demand Folded Computation Day 21: April 14, 2014 Retiming Logical Pipelining Physical Pipelining Retiming Supply Technology Structures Hierarchy 1 2 Image Processing
More informationVlsi Digital Signal Processing Systems Design And Implementation
Vlsi Digital Signal Processing Systems Design And Implementation We have made it easy for you to find a PDF Ebooks without any digging. And by having access to our ebooks online or by storing it on your
More informationFLIP-FLOPS and latches, which we collectively refer to as
1294 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 39, NO. 8, AUGUST 2004 A Test Circuit for Measurement of Clocked Storage Element Characteristics Nikola Nedovic, Member, IEEE, William W. Walker, Member,
More informationTODAY computer vision technologies are used with great
ARXIV PREPRINT 1 Origami: A 803 GOp/s/W Convolutional Network Accelerator Lukas Cavigelli, Student Member, IEEE, and Luca Benini, Fellow, IEEE arxiv:1512.04295v2 [cs.cv] 19 Jan 2016 Abstract An ever increasing
More informationInside Digital Design Accompany Lab Manual
1 Inside Digital Design, Accompany Lab Manual Inside Digital Design Accompany Lab Manual Simulation Prototyping Synthesis and Post Synthesis Name- Roll Number- Total/Obtained Marks- Instructor Signature-
More informationYong Cao, Debprakash Patnaik, Sean Ponce, Jeremy Archuleta, Patrick Butler, Wu-chun Feng, and Naren Ramakrishnan
Yong Cao, Debprakash Patnaik, Sean Ponce, Jeremy Archuleta, Patrick Butler, Wu-chun Feng, and Naren Ramakrishnan Virginia Polytechnic Institute and State University Reverse-engineer the brain National
More informationHybrid Discrete-Continuous Computer Architectures for Post-Moore s-law Era
Hybrid Discrete-Continuous Computer Architectures for Post-Moore s-law Era Keynote at the Bi annual HiPEAC Compu6ng Systems Week Mee6ng Barcelona, Spain October 19 th 2010 Prof. Simha Sethumadhavan Columbia
More informationScalable Lossless High Definition Image Coding on Multicore Platforms
Scalable Lossless High Definition Image Coding on Multicore Platforms Shih-Wei Liao 2, Shih-Hao Hung 2, Chia-Heng Tu 1, and Jen-Hao Chen 2 1 Graduate Institute of Networking and Multimedia 2 Department
More informationHow to Manage Video Frame- Processing Time Deviations in ASIC and SOC Video Processors
WHITE PAPER How to Manage Video Frame- Processing Time Deviations in ASIC and SOC Video Processors Some video frames take longer to process than others because of the nature of digital video compression.
More informationHardware Implementation of Block GC3 Lossless Compression Algorithm for Direct-Write Lithography Systems
Hardware Implementation of Block GC3 Lossless Compression Algorithm for Direct-Write Lithography Systems Hsin-I Liu, Brian Richards, Avideh Zakhor, and Borivoje Nikolic Dept. of Electrical Engineering
More informationTechnical Note PowerPC Embedded Processors Video Security with PowerPC
Introduction For many reasons, digital platforms are becoming increasingly popular for video security applications. In comparison to traditional analog support, a digital solution can more effectively
More informationALICE Week Technical Board TPC Intelligent Readout Architecture. Volker Lindenstruth Universität Heidelberg
ALICE Week 17.11.99 Technical Board TPC Intelligent Readout Architecture Volker Lindenstruth Universität Heidelberg What s new?? TPC occuppancy is much higher than originally assumed New Trigger Detector
More informationNutaq. PicoDigitizer-125. Up to 64 Channels, 125 MSPS ADCs, FPGA-based DAQ Solution With Up to 32 Channels, 1000 MSPS DACs PRODUCT SHEET. nutaq.
Nutaq Up to 64 Channels, 125 MSPS ADCs, FPGA-based DAQ Solution With Up to 32 Channels, 1000 MSPS DACs PRODUCT SHEET QUEBEC I MONTREAL I N E W YO R K I nutaq.com Nutaq The PicoDigitizer 125-Series is a
More informationTutorial Outline. Typical Memory Hierarchy
Tutorial Outline 8:30-8:45 8:45-9:05 9:05-9:30 9:30-10:30 10:30-10:50 10:50-12:15 12:15-1:30 1:30-2:30 2:30-3:30 3:30-3:50 3:50-4:30 4:30-4:45 Introduction and motivation Sources of power in CMOS designs
More informationSoC and SiP technology for digital consumer electronic systems
and SiP technology for digital consumer electronic systems Akira Matsuzawa Tokyo Institute of Technology 2005 06 20 A. Matsuzawa, Titech 1 Contents Digital consumer electronic systems and technology for
More informationDesign and analysis of microcontroller system using AMBA- Lite bus
Design and analysis of microcontroller system using AMBA- Lite bus Wang Hang Suan 1,*, and Asral Bahari Jambek 1 1 School of Microelectronic Engineering, Universiti Malaysia Perlis, Perlis, Malaysia Abstract.
More informationFPGA Based Implementation of Convolutional Encoder- Viterbi Decoder Using Multiple Booting Technique
FPGA Based Implementation of Convolutional Encoder- Viterbi Decoder Using Multiple Booting Technique Dr. Dhafir A. Alneema (1) Yahya Taher Qassim (2) Lecturer Assistant Lecturer Computer Engineering Dept.
More informationChi Ching Chi, Ben Juurlink A QHD-capable parallel H.264 decoder
Powered by TCPDF (www.tcpdf.org) Chi Ching Chi, Ben Juurlink A QHD-capable parallel H.264 decoder Conference Object, Postprint version This version is available at http://dx.doi.org/1.14279/depositonce-634
More informationLayout Decompression Chip for Maskless Lithography
Layout Decompression Chip for Maskless Lithography Borivoje Nikolić, Ben Wild, Vito Dai, Yashesh Shroff, Benjamin Warlick, Avideh Zakhor, William G. Oldham Department of Electrical Engineering and Computer
More informationIMPLEMENTATION OF X-FACTOR CIRCUITRY IN DECOMPRESSOR ARCHITECTURE
IMPLEMENTATION OF X-FACTOR CIRCUITRY IN DECOMPRESSOR ARCHITECTURE SATHISHKUMAR.K #1, SARAVANAN.S #2, VIJAYSAI. R #3 School of Computing, M.Tech VLSI design, SASTRA University Thanjavur, Tamil Nadu, 613401,
More informationResearch Article. Implementation of Low Power, Delay and Area Efficient Shifters for Memory Based Computation
International Journal of Modern Science and Technology Vol. 2, No. 5, 2017. Page 217-222. http://www.ijmst.co/ ISSN: 2456-0235. Research Article Implementation of Low Power, Delay and Area Efficient Shifters
More informationEnhancing Performance in Multiple Execution Unit Architecture using Tomasulo Algorithm
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology ISSN 2320 088X IMPACT FACTOR: 6.017 IJCSMC,
More informationClock Tree Power Optimization of Three Dimensional VLSI System with Network
Clock Tree Power Optimization of Three Dimensional VLSI System with Network M.Saranya 1, S.Mahalakshmi 2, P.Saranya Devi 3 PG Student, Dept. of ECE, Syed Ammal Engineering College, Ramanathapuram, Tamilnadu,
More informationMULTI-CORE SOFTWARE ARCHITECTURE FOR THE SCALABLE HEVC DECODER. Wassim Hamidouche, Mickael Raulet and Olivier Déforges
2014 IEEE International Conference on Acoustic, Speech and Signal Processing (ICASSP) MULTI-CORE SOFTWARE ARCHITECTURE FOR THE SCALABLE HEVC DECODER Wassim Hamidouche, Mickael Raulet and Olivier Déforges
More informationRetiming Sequential Circuits for Low Power
Retiming Sequential Circuits for Low Power José Monteiro, Srinivas Devadas Department of EECS MIT, Cambridge, MA Abhijit Ghosh Mitsubishi Electric Research Laboratories Sunnyvale, CA Abstract Switching
More informationLogic Analysis Basics
Logic Analysis Basics September 27, 2006 presented by: Alex Dickson Copyright 2003 Agilent Technologies, Inc. Introduction If you have ever asked yourself these questions: What is a logic analyzer? What
More informationTomasulo Algorithm Based Out of Order Execution Processor
Tomasulo Algorithm Based Out of Order Execution Processor Bhavana P.Shrivastava MAaulana Azad National Institute of Technology, Department of Electronics and Communication ABSTRACT In this research work,
More informationLogic Analysis Basics
Logic Analysis Basics September 27, 2006 presented by: Alex Dickson Copyright 2003 Agilent Technologies, Inc. Introduction If you have ever asked yourself these questions: What is a logic analyzer? What
More informationHigh-Efficient Parallel CAVLC Encoders on Heterogeneous Multicore Architectures
46 H. Y. SU, M. WEN, J. REN, N. WU, J. CHAI, C.Y. ZHANG, HIGH-EFFICIENT PARALLEL CAVLC ENCODER High-Efficient Parallel CAVLC Encoders on Heterogeneous Multicore Architectures Huayou SU, Mei WEN, Ju REN,
More informationCalifornia State University, Bakersfield Computer & Electrical Engineering & Computer Science ECE 3220: Digital Design with VHDL Laboratory 7
California State University, Bakersfield Computer & Electrical Engineering & Computer Science ECE 322: Digital Design with VHDL Laboratory 7 Rational: The purpose of this lab is to become familiar in using
More informationImpact of Intermittent Faults on Nanocomputing Devices
Impact of Intermittent Faults on Nanocomputing Devices Cristian Constantinescu June 28th, 2007 Dependable Systems and Networks Outline Fault classes Permanent faults Transient faults Intermittent faults
More informationPerformance Evolution of 16 Bit Processor in FPGA using State Encoding Techniques
Performance Evolution of 16 Bit Processor in FPGA using State Encoding Techniques Madhavi Anupoju 1, M. Sunil Prakash 2 1 M.Tech (VLSI) Student, Department of Electronics & Communication Engineering, MVGR
More informationBuilt-In Proactive Tuning System for Circuit Aging Resilience
IEEE International Symposium on Defect and Fault Tolerance of VLSI Systems Built-In Proactive Tuning System for Circuit Aging Resilience Nimay Shah 1, Rupak Samanta 1, Ming Zhang 2, Jiang Hu 1, Duncan
More informationTHE Collider Detector at Fermilab (CDF) [1] is a general
The Level-3 Trigger at the CDF Experiment at Tevatron Run II Y.S. Chung 1, G. De Lentdecker 1, S. Demers 1,B.Y.Han 1, B. Kilminster 1,J.Lee 1, K. McFarland 1, A. Vaiciulis 1, F. Azfar 2,T.Huffman 2,T.Akimoto
More informationLarge Area, High Speed Photo-detectors Readout
Large Area, High Speed Photo-detectors Readout Jean-Francois Genat + On behalf and with the help of Herve Grabas +, Samuel Meehan +, Eric Oberla +, Fukun Tang +, Gary Varner ++, and Henry Frisch + + University
More informationTiming EECS141 EE141. EE141-Fall 2011 Digital Integrated Circuits. Pipelining. Administrative Stuff. Last Lecture. Latch-Based Clocking.
EE141-Fall 2011 Digital Integrated Circuits Lecture 2 Clock, I/O Timing 1 4 Administrative Stuff Pipelining Project Phase 4 due on Monday, Nov. 21, 10am Homework 9 Due Thursday, December 1 Visit to Intel
More informationOptical clock distribution for a more efficient use of DRAMs
Optical clock distribution for a more efficient use of DRAMs D. Litaize, M.P.Y. Desmulliez*, J. Collet**, P. Foulk* Institut de Recherche en Informatique de Toulouse (IRIT), Universite Paul Sabatier, 31062
More informationNew Directions in Manufacturing Test
New Directions in Manufacturing Test Jacob A. Abraham Computer Engineering Research Center The University of Texas at Austin Shanghai Jiao Tong University July 19, 2005 July 19, 2005 1 Research Areas Manufacturing
More informationOn the HPDP from architecture to a device. Final Presentation Days ESTEC, May 9 th 2017
On the HPDP from architecture to a device Final Presentation Days ESTEC, May 9 th 2017 Outline Introduction HPDP Architecture Top Design Comparisons Target applications Design Flow Operating Environment
More informationAs for Dec, BSK 2015 Business Line Cards
As for Dec, 2014 BSK 2015 Business Line Cards Bigshine Korea History 1989 ~ 1999 1999 Bigshine Chicago 현지법인설립 1998 NexFlash 사와 Agency Contract InNet 사와 Agency Contract 1997 Bigshine Korea Co., Ltd. 상호변경
More informationFOR MULTIMEDIA mobile systems powered by a battery
IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 7, NO. 1, FEBRUARY 2005 67 ITRON-LP: Power-Conscious Real-Time OS Based on Cooperative Voltage Scaling for Multimedia Applications Hiroshi Kawaguchi, Member, IEEE,
More informationCertus TM Silicon Debug: Don t Prototype Without It by Doug Amos, Mentor Graphics
Certus TM Silicon Debug: Don t Prototype Without It by Doug Amos, Mentor Graphics FPGA PROTOTYPE RUNNING NOW WHAT? Well done team; we ve managed to get 100 s of millions of gates of FPGA-hostile RTL running
More informationA Novel Macroblock-Level Filtering Upsampling Architecture for H.264/AVC Scalable Extension
05-Silva-AF:05-Silva-AF 8/19/11 6:18 AM Page 43 A Novel Macroblock-Level Filtering Upsampling Architecture for H.264/AVC Scalable Extension T. L. da Silva 1, L. A. S. Cruz 2, and L. V. Agostini 3 1 Telecommunications
More informationA Novel VLSI Architecture of Motion Compensation for Multiple Standards
A Novel VLSI Architecture of Motion Compensation for Multiple Standards Junhao Zheng, Wen Gao, Senior Member, IEEE, David Wu, and Don Xie Abstract Motion compensation (MC) is one of the most important
More informationChapter 1. Introduction to Digital Signal Processing
Chapter 1 Introduction to Digital Signal Processing 1. Introduction Signal processing is a discipline concerned with the acquisition, representation, manipulation, and transformation of signals required
More informationLOW POWER AND HIGH PERFORMANCE SHIFT REGISTERS USING PULSED LATCH TECHNIQUE
OI: 10.21917/ijme.2018.0088 LOW POWER AN HIGH PERFORMANCE SHIFT REGISTERS USING PULSE LATCH TECHNIUE Vandana Niranjan epartment of Electronics and Communication Engineering, Indira Gandhi elhi Technical
More informationLow Power Design of the Next-Generation High Efficiency Video Coding
Low Power Design of the Next-Generation High Efficiency Video Coding Authors: Muhammad Shafique, Jörg Henkel CES Chair for Embedded Systems Outline Introduction to the High Efficiency Video Coding (HEVC)
More informationSRAM Based Random Number Generator For Non-Repeating Pattern Generation
Applied Mechanics and Materials Online: 2014-06-18 ISSN: 1662-7482, Vol. 573, pp 181-186 doi:10.4028/www.scientific.net/amm.573.181 2014 Trans Tech Publications, Switzerland SRAM Based Random Number Generator
More informationTHE new video coding standard H.264/AVC [1] significantly
832 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 53, NO. 9, SEPTEMBER 2006 Architecture Design of Context-Based Adaptive Variable-Length Coding for H.264/AVC Tung-Chien Chen, Yu-Wen
More informationFrame Processing Time Deviations in Video Processors
Tensilica White Paper Frame Processing Time Deviations in Video Processors May, 2008 1 Executive Summary Chips are increasingly made with processor designs licensed as semiconductor IP (intellectual property).
More informationMPEG decoder Case. K.A. Vissers UC Berkeley Chamleon Systems Inc. and Pieter van der Wolf. Philips Research Eindhoven, The Netherlands
MPEG decoder Case K.A. Vissers UC Berkeley Chamleon Systems Inc. and Pieter van der Wolf Philips Research Eindhoven, The Netherlands 1 Outline Introduction Consumer Electronics Kahn Process Networks Revisited
More information