Efficient GPU Synchronization without Scopes: Saying No to Complex Consistency Models
- Dylan Earl McKenzie
- 6 years ago
Transcription
1 Efficient GPU Synchronization without Scopes: Saying No to Complex Consistency Models
Matthew D. Sinclair, Johnathan Alsop, Sarita V. Adve
University of Illinois at Urbana-Champaign
hetero@cs.illinois.edu
2 Motivation
- Heterogeneous systems are now used for a wide variety of applications
- Emerging applications have fine-grained synchronization
- BUT current GPUs have sub-optimal consistency and coherence

            | De facto                             | Recent
Consistency | Data-race-free (DRF): simple         | Heterogeneous-race-free (HRF), scoped synchronization: complex
Coherence   | High overhead on synchs: inefficient | No overhead for local synchs: efficient for local synch

This work: simple consistency + efficient coherence
3 Motivation (Cont.)
- Do GPU models (HRF) need to be more complex than CPU models (DRF)? NO! Not if coherence is done right!
- DeNovo+DRF: efficient AND simpler memory model
- Comparable or better results vs. GPU+DRF and GPU+HRF
- Saying no to complex consistency models
4 Outline
- Motivation
- Coherence Protocols and Consistency Models
  - Classification
  - GPU Coherence
  - DeNovo Coherence
  - Coherence and Consistency Summary
- Results
- Conclusion
5 A Classification of Coherence Protocols
- Read hit: don't return stale data
- Read miss: find one up-to-date copy

Track up-to-date copy | Invalidator: Writer | Invalidator: Reader
Ownership             | MESI                | DeNovo
Writethrough          | -                   | GPU

- Reader-initiated invalidations: no invalidation or ack traffic, directories, transient states
- Obtaining ownership for written data: reuse owned data across synchs (not flushed at synch points)
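The two axes of this classification can be written down as a small lookup. This is only an illustrative sketch: the `Protocol` class, its field names, and the two helper functions are hypothetical names introduced here, not part of the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protocol:
    name: str
    invalidator: str  # who initiates invalidations: "writer" or "reader"
    tracking: str     # how the up-to-date copy is tracked: "ownership" or "writethrough"


# Placement of the three protocols named on the slide.
PROTOCOLS = [
    Protocol("MESI", "writer", "ownership"),
    Protocol("DeNovo", "reader", "ownership"),
    Protocol("GPU", "reader", "writethrough"),
]


def needs_invalidation_traffic(p):
    # Writer-initiated invalidation implies invalidation and ack messages.
    return p.invalidator == "writer"


def reuses_written_data_across_synchs(p):
    # Owned data does not have to be flushed at synch points.
    return p.tracking == "ownership"


for p in PROTOCOLS:
    print(p.name, needs_invalidation_traffic(p), reuses_written_data_across_synchs(p))
```

Under these assumptions, DeNovo is the only entry that avoids invalidation traffic and still reuses written data across synchs.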
6 GPU Coherence with DRF
[Figure: GPU cache and CPU cache connected to L2 cache banks over an interconnection network; labels: flush dirty data, invalidate all data.]
With data-race-free (DRF) memory model
- No data races; synchs must be explicitly distinguished
- At all synch points:
  - Flush all dirty data: unnecessary writethroughs
  - Invalidate all data: can't reuse data across synch points
- Synchronization accesses must go to the last level cache (LLC)
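The synch-point behavior described on this slide can be sketched as a toy model. The class `GpuL1`, its methods, and the dictionary-based last level cache are assumptions made for illustration; real hardware tracks lines and states very differently.

```python
class GpuL1:
    """Toy L1 cache under GPU coherence + DRF (illustrative only)."""

    def __init__(self, llc):
        self.llc = llc           # shared last level cache: address -> value
        self.lines = {}          # address -> (value, dirty flag)
        self.writethroughs = 0   # number of lines flushed to the LLC

    def read(self, addr):
        if addr not in self.lines:
            # Miss: fetch the value from the LLC (0 if never written).
            self.lines[addr] = (self.llc.get(addr, 0), False)
        return self.lines[addr][0]

    def write(self, addr, value):
        self.lines[addr] = (value, True)

    def synch(self):
        # Flush all dirty data to the LLC ...
        for addr, (value, dirty) in self.lines.items():
            if dirty:
                self.llc[addr] = value
                self.writethroughs += 1
        # ... then invalidate everything, so nothing is reused after the synch.
        self.lines.clear()


llc = {}
l1 = GpuL1(llc)
l1.write(0x10, 7)
l1.read(0x20)
l1.synch()
print(llc, l1.lines, l1.writethroughs)
```

After the synch, the written value is in the LLC and the L1 is empty, so the next access to either address misses.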
7 GPU Coherence with HRF
With heterogeneous-race-free (HRF) memory model [ASPLOS '14]
- No data races; synchs and their scopes must be explicitly distinguished
- At all global synch points:
  - Flush all dirty data: unnecessary writethroughs
  - Invalidate all data: can't reuse data across synch points
- Global synchronization accesses must go to the last level cache (LLC)
- No overhead for locally scoped synchs
  - But higher programming complexity
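How the scope changes what a synch costs can be sketched as a small dispatch function. The function name, the scope strings, and the action labels below are hypothetical and used only for illustration.

```python
def hrf_synch_actions(scope):
    """L1 actions for a synch of the given scope under GPU coherence + HRF (toy model)."""
    if scope == "local":
        # Locally scoped synch: performed at L1, no flush or invalidation.
        return ["synch_at_l1"]
    if scope == "global":
        # Globally scoped synch: same cost as GPU coherence + DRF.
        return ["flush_dirty_data", "invalidate_all_data", "synch_at_llc"]
    raise ValueError(f"unknown scope: {scope!r}")


print(hrf_synch_actions("local"))
print(hrf_synch_actions("global"))
```

The programming complexity mentioned on the slide comes from the fact that the programmer has to choose the `scope` argument correctly for every synchronization.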
8 DeNovo Coherence with DRF
[Figure: GPU cache and CPU cache connected to L2 cache banks over an interconnection network; labels: obtain ownership, invalidate non-owned data.]
With data-race-free (DRF) memory model
- No data races; synchs must be explicitly distinguished
- At all synch points:
  - Obtain ownership for dirty data (instead of flushing all dirty data)
  - Invalidate all non-owned data: can reuse owned data
- Synchronization accesses can be performed at L1 (instead of going to the last level cache (LLC))
- 3% state overhead vs. GPU coherence + HRF
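A toy model of the synch-point behavior on this slide is sketched below. The class `DeNovoL1`, the `registry` dictionary standing in for ownership tracking at the LLC, and the state names are assumptions for illustration; forwarding reads to a remote owner is deliberately left out.

```python
class DeNovoL1:
    """Toy L1 cache under DeNovo coherence + DRF (single cache, illustrative only)."""

    def __init__(self, llc, registry, cache_id):
        self.llc = llc            # shared last level cache: address -> value
        self.registry = registry  # address -> id of the L1 that owns the line
        self.cache_id = cache_id
        self.lines = {}           # address -> (value, state): "valid", "dirty", "owned"

    def read(self, addr):
        if addr not in self.lines:
            # Miss: fetch from the LLC (remote-owner forwarding omitted here).
            self.lines[addr] = (self.llc.get(addr, 0), "valid")
        return self.lines[addr][0]

    def write(self, addr, value):
        already_owned = addr in self.lines and self.lines[addr][1] == "owned"
        self.lines[addr] = (value, "owned" if already_owned else "dirty")

    def synch(self):
        kept = {}
        for addr, (value, state) in self.lines.items():
            if state == "dirty":
                # Obtain ownership instead of writing the data through.
                self.registry[addr] = self.cache_id
                state = "owned"
            if state == "owned":
                kept[addr] = (value, state)  # owned data survives the synch
        self.lines = kept                    # non-owned data is invalidated


llc, registry = {}, {}
l1 = DeNovoL1(llc, registry, "gpu0")
l1.write(0x10, 7)
l1.read(0x20)
l1.synch()
print(l1.lines, registry, llc)
```

In contrast to the GPU coherence case, the written line stays in the L1 after the synch and no data is written through to the LLC.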
9 DeNovo Configurations Studied
- DeNovo+DRF: invalidates all non-owned data at synch points
- DeNovo-RO+DRF: avoids invalidating read-only data at synch points
- DeNovo+HRF: reuses valid data if the synch is locally scoped
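The difference between the three configurations comes down to which cached lines are invalidated at a synch point. The sketch below makes that concrete; the function name, the per-line `owned` and `read_only` flags, and the way read-only data is identified are all assumptions for illustration.

```python
def invalidated_at_synch(config, lines, scope="global"):
    """Return the addresses a synch invalidates under each configuration (toy model)."""
    non_owned = [line for line in lines if not line["owned"]]
    if config == "DeNovo+DRF":
        victims = non_owned
    elif config == "DeNovo-RO+DRF":
        # Read-only data is spared (how it is identified is assumed here).
        victims = [line for line in non_owned if not line["read_only"]]
    elif config == "DeNovo+HRF":
        # A locally scoped synch keeps valid data; a global one does not.
        victims = [] if scope == "local" else non_owned
    else:
        raise ValueError(f"unknown configuration: {config!r}")
    return [line["addr"] for line in victims]


lines = [
    {"addr": 0xA0, "owned": True, "read_only": False},
    {"addr": 0xB0, "owned": False, "read_only": True},
    {"addr": 0xC0, "owned": False, "read_only": False},
]
for config in ("DeNovo+DRF", "DeNovo-RO+DRF", "DeNovo+HRF"):
    print(config, [hex(a) for a in invalidated_at_synch(config, lines)])
```

Owned data is never invalidated in any of the three configurations; they differ only in how much non-owned data they keep.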
10 Coherence & Consistency Summary
Coherence + Consistency | Reuse Owned Data | Reuse Valid Data | Do Synchs at L1
GPU + DRF               | X                | X                | X
GPU + HRF               | local            | local            | local
DeNovo + DRF            | yes              | X                | yes
DeNovo-RO + DRF         | yes              | read-only        | yes
DeNovo + HRF            | yes              | local            | yes
11 Outline
- Motivation
- Coherence Protocols and Consistency Models
- Results
- Conclusion
12 Evaluation Methodology
- 1 CPU core + 15 GPU compute units (CU)
  - Each node has private L1, scratchpad, tile of shared L2
- Simulation environment: GEMS, Simics, Garnet, GPGPU-Sim, GPUWattch, McPAT
- Workloads
  - 10 apps from Rodinia, Parboil: no fine-grained synch
    - DeNovo and GPU coherence perform comparably
  - UC-Davis microbenchmarks + UTS from HRF paper: mutex, semaphore, barrier, work sharing
    - Shows potential for future apps
    - Created two versions of each: globally scoped and locally/hybrid scoped synch
13 Global Synch Execution Time
[Figure: bar chart of execution time (axis 0% to 100%) for FAM, SLM, SPM, SPMBO, and AVG, with G* and D* bars for each benchmark.]
DeNovo has 28% lower execution time than GPU with global synch
14 Global Synch Energy
[Figure: stacked bar chart of energy (axis 0% to 100%) for FAM, SLM, SPM, SPMBO, and AVG, with G* and D* bars for each benchmark; components: N/W, L2 $, L1 D$, Scratch, GPU Core+.]
DeNovo has 51% lower energy than GPU with global synch
15-18 Local Synch Execution Time
[Figure: bar chart of execution time (axis 0% to 100%) for FAM, SLM, SPM, SPMBO, SS, SSBO, TBEX, TB, UTS, and AVG.]
- GPU+HRF is much better than GPU+DRF with local synch [ASPLOS '14]
- DeNovo+DRF is comparable to GPU+HRF, but with a simpler consistency model
- DeNovo-RO+DRF reduces the gap by not invalidating read-only data
- DeNovo+HRF is best, if the consistency complexity is acceptable
19 Local Synch Energy
[Figure: stacked bar chart of energy (axis 0% to 100%) for FAM, SLM, SPM, SPMBO, SS, SSBO, TBEX, TB, UTS, and AVG; components: N/W, L2 $, L1 D$, Scratch, GPU Core+.]
Energy trends are similar to execution time
20 Conclusions
- Emerging heterogeneous apps use fine-grained synch
  - GPU coherence + DRF: inefficient, but simple memory model
  - GPU coherence + HRF: efficient, but complex memory model
- Do GPU models (HRF) need to be more complex than CPU models (DRF)?
- DeNovo + DRF: efficient AND simple memory model
- Saying no to complex consistency models!