Efficient GPU Synchronization without Scopes: Saying No to Complex Consistency Models

1 Efficient GPU Synchronization without Scopes: Saying No to Complex Consistency Models
Matthew D. Sinclair, Johnathan Alsop, Sarita V. Adve
University of Illinois at Urbana-Champaign
hetero@cs.illinois.edu

2 Motivation
- Heterogeneous systems are now used for a wide variety of applications
- Emerging applications have fine-grained synchronization
- BUT current GPUs have sub-optimal consistency and coherence
- De facto: consistency is data-race-free (DRF), which is simple; coherence has high overhead on synchs, which is inefficient
- Recent: consistency is heterogeneous-race-free (HRF) with scoped synchronization, which is complex; coherence has no overhead for local synchs, which is efficient for local synch
- This work: simple consistency + efficient coherence

3 Motivation (Cont.)
- Do GPU models (HRF) need to be more complex than CPU models (DRF)?
- NO! Not if coherence is done right!
- DeNovo+DRF: efficient AND simpler memory model
- Comparable or better results vs. GPU+DRF and GPU+HRF
- Saying no to complex consistency models

4 Outline
- Motivation
- Coherence Protocols and Consistency Models
  - Classification
  - GPU Coherence
  - DeNovo Coherence
  - Coherence and Consistency Summary
- Results
- Conclusion

5 A Classification of Coherence Protocols
- Read hit: don't return stale data (invalidator: writer or reader)
- Read miss: find one up-to-date copy (tracking the up-to-date copy: ownership or writethrough)
- MESI: writer invalidates, ownership
- DeNovo: reader invalidates, ownership
- GPU: reader invalidates, writethrough
- Reader-initiated invalidations: no invalidation or ack traffic, directories, transient states
- Obtaining ownership for written data: reuse owned data across synchs (not flushed at synch points)

6 GPU Coherence with DRF
[Figure: GPU cache (valid and dirty lines) and CPU cache connected to L2 cache banks through an interconnection network; at a synch point the GPU cache flushes dirty data and invalidates all data]
- With data-race-free (DRF) memory model
  - No data races; synchs must be explicitly distinguished
- At all synch points
  - Flush all dirty data: unnecessary writethroughs
  - Invalidate all data: can't reuse data across synch points
- Synchronization accesses must go to last level cache (LLC)
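The synch-point behavior on this slide can be sketched as a toy model. This is only an illustration under stated assumptions, not the simulator or protocol implementation used in the talk: the class `GpuL1`, the dict-based L2, and the `writethroughs` counter are hypothetical names introduced here.

```python
from enum import Enum


class State(Enum):
    VALID = "valid"
    DIRTY = "dirty"


class GpuL1:
    """Toy L1 cache following GPU-style coherence (hypothetical model)."""

    def __init__(self, l2):
        self.l2 = l2            # shared L2, modeled as a dict: address -> value
        self.lines = {}         # address -> (State, value)
        self.writethroughs = 0  # counts lines written through at synch points

    def read(self, addr):
        if addr not in self.lines:  # miss: fetch the value from the L2
            self.lines[addr] = (State.VALID, self.l2.get(addr, 0))
        return self.lines[addr][1]

    def write(self, addr, value):
        # The write stays in the L1 until the next synch point.
        self.lines[addr] = (State.DIRTY, value)

    def synch(self):
        # Flush all dirty data to the L2 ...
        for addr, (state, value) in self.lines.items():
            if state is State.DIRTY:
                self.l2[addr] = value
                self.writethroughs += 1
        # ... then invalidate all data, so nothing is reused after the synch.
        self.lines.clear()


l2 = {0: 10, 1: 20}
l1 = GpuL1(l2)
l1.read(0)       # brings address 0 into the L1 as valid
l1.write(1, 99)  # address 1 becomes dirty in the L1
l1.synch()
print(l2[1], len(l1.lines), l1.writethroughs)  # prints "99 0 1"
```

After the synch the L1 is empty, so the next read of address 0 misses again even though its value never changed. That lost reuse is the cost the slide points at.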

7 GPU Coherence with HRF
- With heterogeneous-race-free (HRF) memory model [ASPLOS '14]
  - No heterogeneous races; synchs and their scopes must be explicitly distinguished
- At all global synch points
  - Flush all dirty data: unnecessary writethroughs
  - Invalidate all data: can't reuse data across synch points
- Global synchronization accesses must go to last level cache (LLC)
- No overhead for locally scoped synchs
  - But higher programming complexity
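A minimal sketch of how a scope changes the work done at a synch point, assuming a toy cache model. The class `ScopedL1` and the scope strings `"local"` and `"global"` are hypothetical names chosen for illustration; they are not an API from the talk or from any GPU runtime.

```python
class ScopedL1:
    """Toy L1 cache whose synch cost depends on the synch's scope (hypothetical model)."""

    def __init__(self, l2):
        self.l2 = l2     # shared L2, modeled as a dict: address -> value
        self.lines = {}  # address -> [value, dirty flag]

    def read(self, addr):
        if addr not in self.lines:
            self.lines[addr] = [self.l2.get(addr, 0), False]
        return self.lines[addr][0]

    def write(self, addr, value):
        self.lines[addr] = [value, True]

    def synch(self, scope):
        if scope == "local":
            # Locally scoped: only threads sharing this L1 synchronize,
            # so no flush and no invalidation are needed.
            return
        if scope != "global":
            raise ValueError(f"unknown scope: {scope}")
        # Globally scoped: same cost as GPU coherence with DRF.
        for addr, (value, dirty) in self.lines.items():
            if dirty:
                self.l2[addr] = value
        self.lines.clear()


l2 = {0: 1}
l1 = ScopedL1(l2)
l1.read(0)
l1.write(1, 7)

l1.synch("local")
after_local = (len(l1.lines), 1 in l2)   # (2, False): data kept, nothing flushed

l1.synch("global")
after_global = (len(l1.lines), l2[1])    # (0, 7): flushed and invalidated
print(after_local, after_global)
```

The programmer has to pick the `scope` argument correctly at every synch, which is the added programming complexity the slide refers to.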

8 DeNovo Coherence with DRF
[Figure: GPU cache (owned, dirty, and valid lines) and CPU cache connected to L2 cache banks through an interconnection network; at a synch point the GPU cache obtains ownership for dirty data and invalidates non-owned data]
- With data-race-free (DRF) memory model
  - No data races; synchs must be explicitly distinguished
- At all synch points
  - Obtain ownership for dirty data (instead of flushing all dirty data): can reuse owned data
  - Invalidate all non-owned data
- Synchronization accesses can be performed at L1 (instead of going to the last level cache)
- 3% state overhead vs. GPU coherence + HRF
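The DeNovo synch-point behavior can be sketched in the same toy style. This is a simplified illustration, not the DeNovo protocol itself: the classes `L2` and `DeNovoL1`, the string-valued states, and the way a previous owner drops its copy on an ownership transfer are all assumptions made to keep the example small.

```python
class L2:
    """Toy shared L2 that also records which L1 owns each address (hypothetical model)."""

    def __init__(self):
        self.data = {}   # address -> value
        self.owner = {}  # address -> DeNovoL1 holding the up-to-date copy


class DeNovoL1:
    """Toy L1 cache with DeNovo-style ownership (hypothetical model)."""

    def __init__(self, l2):
        self.l2 = l2
        self.lines = {}  # address -> [value, state]; state is "valid", "dirty" or "owned"

    def read(self, addr):
        if addr not in self.lines:
            owner = self.l2.owner.get(addr)
            if owner is not None:
                value = owner.lines[addr][0]    # fetch from the owning L1
            else:
                value = self.l2.data.get(addr, 0)
            self.lines[addr] = [value, "valid"]
        return self.lines[addr][0]

    def write(self, addr, value):
        state = self.lines[addr][1] if addr in self.lines else "valid"
        self.lines[addr] = [value, "owned" if state == "owned" else "dirty"]

    def synch(self):
        for addr, line in list(self.lines.items()):
            if line[1] == "dirty":
                # Obtain ownership instead of writing the data through.
                previous = self.l2.owner.get(addr)
                if previous is not None and previous is not self:
                    previous.lines.pop(addr, None)  # simplification: old owner drops its copy
                self.l2.owner[addr] = self
                line[1] = "owned"
            elif line[1] == "valid":
                del self.lines[addr]                # invalidate non-owned data
            # Owned lines are kept and can be reused after the synch.


l2 = L2()
l2.data[0] = 5
a, b = DeNovoL1(l2), DeNovoL1(l2)
a.read(0)
a.write(1, 42)
a.synch()
print(sorted(a.lines), b.read(1), 1 in l2.data)  # prints "[1] 42 False"
```

Address 1 survives the synch in `a` without a writethrough, and `b` still finds the up-to-date value through the ownership record.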

9 DeNovo Configurations Studied
- DeNovo+DRF: invalidate all non-owned data at synch points
- DeNovo-RO+DRF: avoids invalidating read-only data at synch points
- DeNovo+HRF: reuse valid data if synch is locally scoped
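The three configurations differ only in which L1 lines survive a synch point, which can be written as a small decision function. This is a sketch under assumptions: the function name `keeps_line`, the `read_only` flag (standing in for however read-only data is identified), and the scope strings are hypothetical. Dirty lines are assumed to have already become owned at the synch.

```python
def keeps_line(config, state, read_only=False, scope="global"):
    """Return True if an L1 line survives a synch point under `config`.

    `state` is "owned" or "valid".
    """
    if state == "owned":
        return True                # every DeNovo variant reuses owned data
    if config == "DeNovo+DRF":
        return False               # all non-owned data is invalidated
    if config == "DeNovo-RO+DRF":
        return read_only           # read-only data is not invalidated
    if config == "DeNovo+HRF":
        return scope == "local"    # valid data is reused for locally scoped synchs
    raise ValueError(f"unknown configuration: {config}")


for config in ("DeNovo+DRF", "DeNovo-RO+DRF", "DeNovo+HRF"):
    print(config,
          keeps_line(config, "valid"),
          keeps_line(config, "valid", read_only=True),
          keeps_line(config, "valid", scope="local"))
```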

10 Coherence & Consistency Summary
Coherence + Consistency | Reuse owned data | Reuse valid data | Do synchs at L1
GPU + DRF               | no               | no               | no
GPU + HRF               | local            | local            | local
DeNovo + DRF            | yes              | no               | yes
DeNovo-RO + DRF         | yes              | read-only        | yes
DeNovo + HRF            | yes              | local            | yes

11 Outline
- Motivation
- Coherence Protocols and Consistency Models
- Results
- Conclusion

12 Evaluation Methodology
- 1 CPU core + 15 GPU compute units (CU)
  - Each node has private L1, scratchpad, tile of shared L2
- Simulation environment
  - GEMS, Simics, Garnet, GPGPU-Sim, GPUWattch, McPAT
- Workloads
  - 10 apps from Rodinia, Parboil: no fine-grained synch
    - DeNovo and GPU coherence perform comparably
  - UC-Davis microbenchmarks + UTS from HRF paper
    - Mutex, semaphore, barrier, work sharing
    - Shows potential for future apps
    - Created two versions of each: globally scoped and locally/hybrid scoped synch

13 Global Synch Execution Time
[Figure: execution time on a 0% to 100% axis for FAM, SLM, SPM, SPMBO, and AVG; one G* bar and one D* bar per benchmark]
- DeNovo has 28% lower execution time than GPU with global synch

14 Global Synch Energy
[Figure: energy on a 0% to 100% axis for FAM, SLM, SPM, SPMBO, and AVG; one G* bar and one D* bar per benchmark, broken down into N/W, L2 $, L1 D$, Scratch, and GPU Core+]
- DeNovo has 51% lower energy than GPU with global synch

18 Local Synch Execution Time
[Figure: execution time on a 0% to 100% axis for FAM, SLM, SPM, SPMBO, SS, SSBO, TBEX, TB, UTS, and AVG]
- GPU+HRF is much better than GPU+DRF with local synch [ASPLOS '14]
- DeNovo+DRF comparable to GPU+HRF, but simpler consistency model
- DeNovo-RO+DRF reduces gap by not invalidating read-only data
- DeNovo+HRF is best, if consistency complexity acceptable

19 Local Synch Energy
[Figure: energy on a 0% to 100% axis for FAM, SLM, SPM, SPMBO, SS, SSBO, TBEX, TB, UTS, and AVG, broken down into N/W, L2 $, L1 D$, Scratch, and GPU Core+]
- Energy trends similar to execution time

20 Conclusions
- Emerging heterogeneous apps use fine-grained synch
  - GPU coherence + DRF: inefficient, but simple memory model
  - GPU coherence + HRF: efficient, but complex memory model
- Do GPU models (HRF) need to be more complex than CPU models (DRF)?
- DeNovo + DRF: efficient AND simple memory model
- Saying no to complex consistency models!


LEARN TO BE AN EXPERT FROM THE EXPERTS IN CABLE TECHNOLOGY. HOW DO YOU TURN YOURSELF INTO A CABLE TECHNOLOGY EXPERT? TURN TO THE EXPERTS. SCTE ISBE is the industry leader in developing certified experts. And you can be next. Learn core network technologies by taking

More information

AMOLED compensation circuit patent analysis

AMOLED compensation circuit patent analysis IHS Electronics & Media Key Patent Report AMOLED compensation circuit patent analysis AMOLED pixel driving circuit with threshold voltage and IR-drop compensation July 2013 ihs.com Ian Lim, Senior Analyst,

More information

BaBar Grid. Tim Adye Particle Physics Department Rutherford Appleton Laboratory. PP Grid Team Coseners House 8 th November 2002

BaBar Grid. Tim Adye Particle Physics Department Rutherford Appleton Laboratory. PP Grid Team Coseners House 8 th November 2002 BaBar Grid Tim Adye Particle Physics Department Rutherford Appleton Laboratory PP Grid Team Coseners House 8 th November 2002 8th November 2002 Tim Adye 1 Talk Plan BaBar distributed computing model RAL

More information

FinFETs & SRAM Design

FinFETs & SRAM Design FinFETs & SRAM Design Raymond Leung VP Engineering, Embedded Memories April 19, 2013 Synopsys 2013 1 Agenda FinFET the Device SRAM Design with FinFETs Reliability in FinFETs Summary Synopsys 2013 2 How

More information

AUTOMATIC LICENSE PLATE RECOGNITION(ALPR) ON EMBEDDED SYSTEM

AUTOMATIC LICENSE PLATE RECOGNITION(ALPR) ON EMBEDDED SYSTEM AUTOMATIC LICENSE PLATE RECOGNITION(ALPR) ON EMBEDDED SYSTEM Presented by Guanghan APPLICATIONS 1. Automatic toll collection 2. Traffic law enforcement 3. Parking lot access control 4. Road traffic monitoring

More information

Digital to Mixed-Signal Verification of Power Management SOCs Using Questa-ADMS. M. Behaghel

Digital to Mixed-Signal Verification of Power Management SOCs Using Questa-ADMS. M. Behaghel Digital to Mixed-Signal Verification of Power Management SOCs Using Questa-ADMS M. Behaghel A global leader in wireless technologies Leading supplier of platforms and semiconductors for wireless devices

More information

Implementation of an MPEG Codec on the Tilera TM 64 Processor

Implementation of an MPEG Codec on the Tilera TM 64 Processor 1 Implementation of an MPEG Codec on the Tilera TM 64 Processor Whitney Flohr Supervisor: Mark Franklin, Ed Richter Department of Electrical and Systems Engineering Washington University in St. Louis Fall

More information

Samsung QMD Series SMART Signage

Samsung QMD Series SMART Signage Data sheet Samsung QMD Series SMART Signage Display business messaging in ultra-realistic detail Highlights Deliver your business messaging 16/7 in Ultra-High-Definition (UHD) resolution with ultimate

More information

A Low-Power 0.7-V H p Video Decoder

A Low-Power 0.7-V H p Video Decoder A Low-Power 0.7-V H.264 720p Video Decoder D. Finchelstein, V. Sze, M.E. Sinangil, Y. Koken, A.P. Chandrakasan A-SSCC 2008 Outline Motivation for low-power video decoders Low-power techniques pipelining

More information

The Internet-of-Things For Biodiversity

The Internet-of-Things For Biodiversity The Internet-of-Things For Biodiversity Adam T. Drobot Wayne, PA 19087 Outline What: About IoT Aspects of IoT Key ingredients Dealing with Complexity The basic ingredients for IoT Examples of IoT that

More information

Research Article. Implementation of Low Power, Delay and Area Efficient Shifters for Memory Based Computation

Research Article. Implementation of Low Power, Delay and Area Efficient Shifters for Memory Based Computation International Journal of Modern Science and Technology Vol. 2, No. 5, 2017. Page 217-222. http://www.ijmst.co/ ISSN: 2456-0235. Research Article Implementation of Low Power, Delay and Area Efficient Shifters

More information

DEDICATED TO EMBEDDED SOLUTIONS

DEDICATED TO EMBEDDED SOLUTIONS DEDICATED TO EMBEDDED SOLUTIONS DESIGN SAFE FPGA INTERNAL CLOCK DOMAIN CROSSINGS ESPEN TALLAKSEN DATA RESPONS SCOPE Clock domain crossings (CDC) is probably the worst source for serious FPGA-bugs that

More information

Disruptive Weather Conditions: Clouds in the Forecast Welcome!

Disruptive Weather Conditions: Clouds in the Forecast Welcome! Disruptive Weather Conditions: Clouds in the Forecast Welcome! SMPTE Educational Webcast Sponsors Thank you to our sponsors for their generous support of SMPTE and the SMPTE Professional Development Academy:

More information

Device Management Requirements

Device Management Requirements Device Management Requirements Approved Version 2.0 09 Feb 2016 Open Mobile Alliance OMA-RD-DM-V2_0-20160209-A [OMA-Template-ReqDoc-20160101-I] OMA-RD-DM-V2_0-20160209-A Page 2 (14) Use of this document

More information

Vicon Valerus Performance Guide

Vicon Valerus Performance Guide Vicon Valerus Performance Guide General With the release of the Valerus VMS, Vicon has introduced and offers a flexible and powerful display performance algorithm. Valerus allows using multiple monitors

More information

On-Supporting Energy Balanced K-Barrier Coverage In Wireless Sensor Networks

On-Supporting Energy Balanced K-Barrier Coverage In Wireless Sensor Networks On-Supporting Energy Balanced K-Barrier Coverage In Wireless Sensor Networks Chih-Yung Chang cychang@mail.tku.edu.t w Li-Ling Hung Aletheia University llhung@mail.au.edu.tw Yu-Chieh Chen ycchen@wireless.cs.tk

More information

GROUNDBREAKING INNOVATIONS FOR DYNAMIC LIGHTING

GROUNDBREAKING INNOVATIONS FOR DYNAMIC LIGHTING GROUNDBREAKING INNOVATIONS FOR DYNAMIC LIGHTING LIGHTING CONTROL SHOULD BE EASY IN ANY KIND OF WAY The complexity of a lighting installation may never be a limitation for its feasibility. That is where

More information

Primary Frequency Response Ancillary Service Market Designs

Primary Frequency Response Ancillary Service Market Designs Engineering Conferences International ECI Digital Archives Modeling, Simulation, And Optimization for the 21st Century Electric Power Grid Proceedings Fall 10-24-2012 Primary Frequency Response Ancillary

More information

DYNAMIC VOLTAGE SCALING TECHNIQUES FOR POWER-EFFICIENT MPEG DECODING WISSAM CHEDID

DYNAMIC VOLTAGE SCALING TECHNIQUES FOR POWER-EFFICIENT MPEG DECODING WISSAM CHEDID DYNAMIC VOLTAGE SCALING TECHNIQUES FOR POWER-EFFICIENT MPEG DECODING WISSAM CHEDID Bachelor of Science in Electrical Engineering Lebanese University, Lebanon June, 2001 Submitted in partial fulfillment

More information

Revising Technical Manuscripts, Celia M. Elliott. 12 May 2014

Revising Technical Manuscripts, Celia M. Elliott. 12 May 2014 12 May 2014 Three disclaimers: I am not a scientist I m a science writer and technical editor. The author trumps the editor every time. (But you really should listen to us; we have your best interests

More information

Paper review on Mobile Fronthaul Networks

Paper review on Mobile Fronthaul Networks Paper review on Mobile Fronthaul Networks Wei Wang BUPT Ph.d candidate & UC Davis visiting student Email: weiw@bupt.edu.cn, waywang@ucdavis.edu Group Meeting, July. 14, 2017 Contents What is Mobile Fronthaul

More information