GPU Acceleration of a Production Molecular Docking Code

Size: px
Start display at page:

Download "GPU Acceleration of a Production Molecular Docking Code"

Transcription

1 GPU Acceleration of a Production Molecular Docking Code Bharat Sukhwani Martin Herbordt Computer Architecture and Automated Design Laboratory Department of Electrical and Computer Engineering Boston University * This work supported, in part, by the U.S. NIH/NCRR + Thanks to Tom VanCourt (Altera) and Sandor Vajda and Dima Kozakov (BME at Boston University)

2 Why is Docking so important? Problem: Combat the bird flu virus Method: Inhibit its function by gumming up Neuraminidase, a surface protein, with an inhibitor - Neuraminidase helps release progeny viruses from the cell. Procedure*: - Search protein surface for likely sites - Find a molecule that binds there (and only there) # *Landon, et al. Chem. Biol. Drug Des 2008 # From New Scientist /channel/health/bird-flu 3/20/2009 GPGPU 2009, Washington DC 2

3 Overview of Molecular Docking Docking Modeling interactions between two molecules Computational Task Finding the least energy pose - Offset and rotation of one relative to the other e.g. Exhaustive search Usually performed in two steps - Docking Exhaustive sampling of 3D space - Energy minimization 3/20/2009 GPGPU 2009, Washington DC 3 Figure generated using PyMOL

4 Types of Docking Protein-Protein Docking Complex Structure prediction X-Ray method is difficult Typical grid size: 16 3 to Protein-Ligand Docking Used for drug discovery Screening millions of drug candidates In-silico screening is faster and more cost effective Typical ligand grid size: 4 3 to /20/2009 GPGPU 2009, Washington DC 4

5 Modeling Rigid Docking Rigid-body approximation Grid based computing Exhaustive 6D search Pose score = 3D correlation sum E (α,β,γ ) = R P (i, j,k) L P (i + α, j + β,k + γ ) p i, j,k FFT to speedup the correlation Reduces from O ( N ) to O( N log N) 6 3 Image courtesy of Structural Bioinformatics Lab, BU 3/20/2009 GPGPU 2009, Washington DC 5

6 Why Accelerate Docking? Rigid docking Tens of thousands of rotations Each requires multiple FFTs/ IFFTs Typically: 10 sec per rotation Total runtime ~ 98 hrs! Flexible docking adds another DoF Uses rigid docking as preprocessor or subroutine Faster docking would aid in drug discovery Faster screening (of millions of potential drug candidates) Better discrimination 3/20/2009 GPGPU 2009, Washington DC 6

7 Computations in Rigid Docking Rotation Increments of 5 to 15 degrees Grid assignment For each energy function Pose score FFT, Modulation and IFFT For each energy function Filtering top scores Selecting regional best scores 3/20/2009 GPGPU 2009, Washington DC 7

8 Overview of PIPER Docking Code Based on rigid molecule docking Also used as a subroutine in another program ClusPro docking and discrimination program Uses several energy functions Most sophisticated used in this type of code Core computation is 3D correlations (FFTs) For each energy function, for each rotation. Typical padded grid size = /20/2009 GPGPU 2009, Washington DC 8

9 PIPER Energy Functions Three energy functions Shape complementarity 2 terms Electrostatics 2 terms Pairwise Potential k terms k = 2 to 18 (usually 4) Combined in weighted sum E shape = E attr + w 1 E repul E = E + E elec born P 1 k= 0 coulomb E desol = E pairpot _ k E = E shape + w 2 E elec + w 3 E desol k + 4 correlations per rotation 3/20/2009 GPGPU 2009, Washington DC 9

10 Original PIPER Program Flow On host On GPU Perform once Phase File I/O Receptor and Ligand Ligand Rotation File I/O Parameters, Rotation and Weights Grid Assignment Determination of padded FFT grid size FFT of ligand grids Modulation of grid-pairs Receptor grid assignment for different energy functions IFFT of modulated grids Forward FFT of receptor grids - (P + 4) FFTs Accumulation of desolvation terms Complex conjugate of FFT grids Scoring and Filtering Creation of ligand grids for different energy functions Total runtime per rotation Run Time (sec) Repeat for each rotation Ligand rotation and grid assignment 0.00 Repeat for each of (P + 4) grids 0.23 Forward FFT of ligand Modulation of transformed receptor and ligand grids 4.51 Inverse FFT of modulated grid 0.24 For pairwise potential only: Accumulation of different terms Scoring and filtering % total 0% 2.3% 45.4% 2.2% 45.4% 2.4% 2.3% 100% Best Fit 3/20/2009 GPGPU 2009, Washington DC 10

11 Mapping PIPER to GPU Correlation Direct correlation FFT Correlation FFT IFFT Modulation Accumulation of desolvation terms Scoring and Filtering Rotation and Grid assignment Latency hiding 3/20/2009 GPGPU 2009, Washington DC 11

12 Direct correlation on GPU Replaces steps of FFT, Modulation and IFFT Shifting, Voxel-voxel interaction, grid summation Each multiprocessor accesses both grids Receptor grid Global memory Ligand grid Shared memory SMP Shared Memory SMP Shared Memory SMP Shared Memory Multiple correlations together For different energy functions Global Memory 3/20/2009 GPGPU 2009, Washington DC 12

13 Direct correlation on GPU Shared memory limits the ligand size With 4 pairwise term - 8 cubed ligand For larger ligand grids Store on global memory and swap Degrades performance For smaller grids - Multiple rotations For 4 cubed grid - 8 rotations together SMP Shared Memory Multiple computation per fetch 2.7x performance improvement 3/20/2009 GPGPU 2009, Washington DC 13

14 Direct correlation on GPU Distribution of work among threads 2D Plane to thread block Part of the plane to thread block Yield similar results Result grid SMP SMP SMP SMP SMP SMP SMP SMP 3/20/2009 GPGPU 2009, Washington DC 14

15 FFT Correlation on GPU Direct correlation is not attractive for large grids Multiple FFTs in serial order Using NVIDIA CUFFT library Minimize host device data transfer Perform as many steps on GPU as possible FFT / IFFT only FFT / IFFT + Mod. FFT/IFFT + Mod. + Filtering GPU GPU GPU O(N 3 ) O(N 3 ) O(N 3 ) floats 2-10 floats Host Host Host 3/20/2009 GPGPU 2009, Washington DC 15

16 FFT Correlation on GPU GPU Host FFT Modulation IFFT Scoring and Filtering Global Memory 3/20/2009 GPGPU 2009, Washington DC 16

17 Direct Correlation v/s FFT Direct Correlation FFT Correlation Good for small ligand grids Multiple rotations per iteration Good for large ligands Limits number of energy terms Runtime ligand size Provides implicit filtering Any number of energy terms Runtime padded grid size Explicit filtering required 3/20/2009 GPGPU 2009, Washington DC 17

18 PIPER Scoring and Filtering Critical for overall performance Scoring Multiple sets of weights E = Eshape + w2 Eelec + w3 E desol Filtering Regional Best 3/20/2009 GPGPU 2009, Washington DC 18

19 Scoring and Filtering on GPU Weight-sets distributed on different multiprocessors Weights stored in constant cache Multiprocessors underutilized N 3 Scores SMP K coefficients Unused Multiprocessors SMP SMP SMP SMP SMP SMP SMP SMP Naïve scheme Negative speedup N 3 Scores T 0 N 3 Global Memory Host Memory Second scheme Threads store scores in shared memory Serialization at the end - Thread 0 finds best of best - Also performs flagging of cells Other schemes possible Best Score N 3 N 3 Scores M T 0 T 1 T 2 T M-2 T M-1 Shared Memory T 0 Best Score 3/20/2009 GPGPU 2009, Washington DC 19

20 Scoring and Filtering on GPU Flagging the neighboring cells Serial PIPER: Does not fit in GPU shared memory (N3 entries) Solution 1 Exclusion index array (100 entries) 2-3x slowdown w.r.t. host filtering Solution 2 Bit array on GPU global memory One array for each set of weights Achieves speedup over host filtering (N3 entries each) /20/2009 GPGPU 2009, Washington DC 20

21 Results Speedup for different phase Phase CPU Time (ms) GPU Time (ms) Speedup Once per rotation, per energy grid Forward FFT Modulation Inverse FFT Once per rotation Accumulation of desolvation terms Scoring and Filtering For 22 grids Total runtime per rotation /20/2009 GPGPU 2009, Washington DC 21

22 Results Correlation only Speedup: FFT v/s Direct correlation Correlation only speedups (8 correlations) Speedups (log scale) GPU Direct Correlation GPU FFT Correlation FPGA Direct Correlation cubed 6 cubed 8 cubed 16 cubed 32 cubed Size of ligand grid * Baseline: FFT Correlation on single core GPU: NVIDIA TESLA C1060 FPGA: Altera Stratix III CPU: Intel Quad core 3.00 GHz 3/20/2009 GPGPU 2009, Washington DC 22

23 Results Speedup (log scale) Speedup on different architectures Ligand Docking Protein Docking Correlation only speedups (8 correlations) PIPER Overall Speedup Multicore Best(4 cores) GPU Best FPGA Direct Correlation Speedup Multicore Best(4 cores) GPU Best 30 FPGA Direct Correlation cubed 8 cubed 16 cubed 32 cubed 64 cubed Size of ligand grid 0 4 cubed 8 cubed 16 cubed 32 cubed 64 cubed Size of ligand grid * Baseline: Best Correlation on single core * Baseline: PIPER running on single core 3/20/2009 GPGPU 2009, Washington DC 23

24 Thank You

25 Extra Slides

26 Actual runtimes Results Correlation only runtimes 8 correlations Ligand grid size Serial GPU FPGA 4 cubed 3600 ms 13.5 ms 2.5 ms 8 cubed 3600 ms 170 ms 20 ms 16 cubed 3600 ms 170 ms 160 ms PIPER runtimes for 10,000 rotations 22 correlations Ligand grid size Serial GPU FPGA 4 cubed 28 hrs. 52 min 46 min 8 cubed 28 hrs. 94 min 46 min 16 cubed 28 hrs. 94 min 87 min 3/20/2009 GPGPU 2009, Washington DC 26

27 Results Direct correlation on GPU 8 correlations Runtimes for different grid and block sizes Ligand grid size Grid Size Block Size Runtime 8 cubed 16*16 8*8 16*16 16*16 32*32 8*8*8 8*8*8 4*4*4 8*8*8 4*4*4 245 ms 435 ms 461 ms 1650 ms 16 cubed 8*8 8*8* ms 2205 ms 3/20/2009 GPGPU 2009, Washington DC 27

PRACE Autumn School GPU Programming

PRACE Autumn School GPU Programming PRACE Autumn School 2010 GPU Programming October 25-29, 2010 PRACE Autumn School, Oct 2010 1 Outline GPU Programming Track Tuesday 26th GPGPU: General-purpose GPU Programming CUDA Architecture, Threading

More information

Implementation of an MPEG Codec on the Tilera TM 64 Processor

Implementation of an MPEG Codec on the Tilera TM 64 Processor 1 Implementation of an MPEG Codec on the Tilera TM 64 Processor Whitney Flohr Supervisor: Mark Franklin, Ed Richter Department of Electrical and Systems Engineering Washington University in St. Louis Fall

More information

Amdahl s Law in the Multicore Era

Amdahl s Law in the Multicore Era Amdahl s Law in the Multicore Era Mark D. Hill and Michael R. Marty University of Wisconsin Madison August 2008 @ Semiahmoo Workshop IBM s Dr. Thomas Puzak: Everyone knows Amdahl s Law 2008 Multifacet

More information

FPGA Digital Signal Processing. Derek Kozel July 15, 2017

FPGA Digital Signal Processing. Derek Kozel July 15, 2017 FPGA Digital Signal Processing Derek Kozel July 15, 2017 table of contents 1. Field Programmable Gate Arrays (FPGAs) 2. FPGA Programming Options 3. Common DSP Elements 4. RF Network on Chip 5. Applications

More information

Yong Cao, Debprakash Patnaik, Sean Ponce, Jeremy Archuleta, Patrick Butler, Wu-chun Feng, and Naren Ramakrishnan

Yong Cao, Debprakash Patnaik, Sean Ponce, Jeremy Archuleta, Patrick Butler, Wu-chun Feng, and Naren Ramakrishnan Yong Cao, Debprakash Patnaik, Sean Ponce, Jeremy Archuleta, Patrick Butler, Wu-chun Feng, and Naren Ramakrishnan Virginia Polytechnic Institute and State University Reverse-engineer the brain National

More information

CSE140L: Components and Design Techniques for Digital Systems Lab. CPU design and PLDs. Tajana Simunic Rosing. Source: Vahid, Katz

CSE140L: Components and Design Techniques for Digital Systems Lab. CPU design and PLDs. Tajana Simunic Rosing. Source: Vahid, Katz CSE140L: Components and Design Techniques for Digital Systems Lab CPU design and PLDs Tajana Simunic Rosing Source: Vahid, Katz 1 Lab #3 due Lab #4 CPU design Today: CPU design - lab overview PLDs Updates

More information

High Performance Microprocessor Design and Automation: Overview, Challenges and Opportunities IBM Corporation

High Performance Microprocessor Design and Automation: Overview, Challenges and Opportunities IBM Corporation High Performance Microprocessor Design and Automation: Overview, Challenges and Opportunities Introduction About Myself What to expect out of this lecture Understand the current trend in the IC Design

More information

DC Ultra. Concurrent Timing, Area, Power and Test Optimization. Overview

DC Ultra. Concurrent Timing, Area, Power and Test Optimization. Overview DATASHEET DC Ultra Concurrent Timing, Area, Power and Test Optimization DC Ultra RTL synthesis solution enables users to meet today s design challenges with concurrent optimization of timing, area, power

More information

Designing for High Speed-Performance in CPLDs and FPGAs

Designing for High Speed-Performance in CPLDs and FPGAs Designing for High Speed-Performance in CPLDs and FPGAs Zeljko Zilic, Guy Lemieux, Kelvin Loveless, Stephen Brown, and Zvonko Vranesic Department of Electrical and Computer Engineering University of Toronto,

More information

Computer and Machine Vision

Computer and Machine Vision Computer and Machine Vision Lecture Week 3 Part-1 January 27, 2014 Sam Siewert Outline of Week 3 Processing Images and Moving Pictures High Level View and Computer Architecture for it Linux Platforms for

More information

Ryerson University Department of Electrical and Computer Engineering COE/BME 328 Digital Systems

Ryerson University Department of Electrical and Computer Engineering COE/BME 328 Digital Systems 1 P a g e Ryerson University Department of Electrical and Computer Engineering COE/BME 328 Digital Systems Lab 6 35 Marks (3 weeks) Design of a Simple General-Purpose Processor Due Date: Week 12 Objective:

More information

Voxengo PHA-979 User Guide

Voxengo PHA-979 User Guide Version 2.6 http://www.voxengo.com/product/pha979/ Contents Introduction 3 Features 3 Compatibility 3 User Interface Elements 5 Delay 5 Phase 5 Output 6 Correlometer 7 Introduction 7 Parameters 7 Credits

More information

Microprocessor Design

Microprocessor Design Microprocessor Design Principles and Practices With VHDL Enoch O. Hwang Brooks / Cole 2004 To my wife and children Windy, Jonathan and Michelle Contents 1. Designing a Microprocessor... 2 1.1 Overview

More information

Fooling the Masses with Performance Results: Old Classics & Some New Ideas

Fooling the Masses with Performance Results: Old Classics & Some New Ideas Fooling the Masses with Performance Results: Old Classics & Some New Ideas Gerhard Wellein (1,2), Georg Hager (2) (1) Department for Computer Science (2) Erlangen Regional Computing Center Friedrich-Alexander-Universität

More information

Controlling adaptive resampling

Controlling adaptive resampling Controlling adaptive resampling Fons ADRIAENSEN, Casa della Musica, Pzle. San Francesco 1, 43000 Parma (PR), Italy, fons@linuxaudio.org Abstract Combining audio components that use incoherent sample clocks

More information

Reconfigurable Architectures. Greg Stitt ECE Department University of Florida

Reconfigurable Architectures. Greg Stitt ECE Department University of Florida Reconfigurable Architectures Greg Stitt ECE Department University of Florida How can hardware be reconfigurable? Problem: Can t change fabricated chip ASICs are fixed Solution: Create components that can

More information

New Techniques for Designing and Analyzing Multi-GigaHertz Serial Links

New Techniques for Designing and Analyzing Multi-GigaHertz Serial Links New Techniques for Designing and Analyzing Multi-GigaHertz Serial Links Min Wang, Intel Henri Maramis, Intel Donald Telian, Cadence Kevin Chung, Cadence 1 Agenda 1. Wide Eyes and More Bits 2. Interconnect

More information

Comp 410/510. Computer Graphics Spring Introduction to Graphics Systems

Comp 410/510. Computer Graphics Spring Introduction to Graphics Systems Comp 410/510 Computer Graphics Spring 2018 Introduction to Graphics Systems Computer Graphics Computer graphics deals with all aspects of 'creating images with a computer - Hardware (PC with graphics card)

More information

GALILEO Timing Receiver

GALILEO Timing Receiver GALILEO Timing Receiver The Space Technology GALILEO Timing Receiver is a triple carrier single channel high tracking performances Navigation receiver, specialized for Time and Frequency transfer application.

More information

Selective Intra Prediction Mode Decision for H.264/AVC Encoders

Selective Intra Prediction Mode Decision for H.264/AVC Encoders Selective Intra Prediction Mode Decision for H.264/AVC Encoders Jun Sung Park, and Hyo Jung Song Abstract H.264/AVC offers a considerably higher improvement in coding efficiency compared to other compression

More information

Multicore Design Considerations

Multicore Design Considerations Multicore Design Considerations Multicore: The Forefront of Computing Technology We re not going to have faster processors. Instead, making software run faster in the future will mean using parallel programming

More information

IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 19, NO. 3, MARCH GHEVC: An Efficient HEVC Decoder for Graphics Processing Units

IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 19, NO. 3, MARCH GHEVC: An Efficient HEVC Decoder for Graphics Processing Units IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 19, NO. 3, MARCH 2017 459 GHEVC: An Efficient HEVC Decoder for Graphics Processing Units Diego F. de Souza, Student Member, IEEE, Aleksandar Ilic, Member, IEEE, Nuno

More information

An Introduction to VLSI (Very Large Scale Integrated) Circuit Design

An Introduction to VLSI (Very Large Scale Integrated) Circuit Design An Introduction to VLSI (Very Large Scale Integrated) Circuit Design Presented at EE1001 Oct. 16th, 2018 By Hua Tang The first electronic computer (1946) 2 First Transistor (Bipolar) First transistor Bell

More information

ECE 4220 Real Time Embedded Systems Final Project Spectrum Analyzer

ECE 4220 Real Time Embedded Systems Final Project Spectrum Analyzer ECE 4220 Real Time Embedded Systems Final Project Spectrum Analyzer by: Matt Mazzola 12222670 Abstract The design of a spectrum analyzer on an embedded device is presented. The device achieves minimum

More information

EE5780 Advanced VLSI CAD

EE5780 Advanced VLSI CAD EE5780 Advanced VLSI CAD Lecture 11 SRAM and Yield Analysis Zhuo Feng 11.1 Memory Arrays SRAM Architecture SRAM Cell Decoders Column Circuitry Multiple Ports Outline Serial Access Memories 11.2 Memory

More information

Savant. Savant. SignalCalc. Power in Numbers input channels. Networked chassis with 1 Gigabit Ethernet to host

Savant. Savant. SignalCalc. Power in Numbers input channels. Networked chassis with 1 Gigabit Ethernet to host Power in Numbers Savant SignalCalc 40-1024 input channels Networked chassis with 1 Gigabit Ethernet to host 49 khz analysis bandwidth, all channels with simultaneous storage to disk SignalCalc Dynamic

More information

Department of Electrical & Electronic Engineering Imperial College of Science, Technology and Medicine. Project: Real-Time Speech Enhancement

Department of Electrical & Electronic Engineering Imperial College of Science, Technology and Medicine. Project: Real-Time Speech Enhancement Department of Electrical & Electronic Engineering Imperial College of Science, Technology and Medicine Project: Real-Time Speech Enhancement Introduction Telephones are increasingly being used in noisy

More information

Experimental Results of the Coaxial Multipactor Experiment. T.P. Graves, B. LaBombard, S.J. Wukitch, I.H. Hutchinson PSFC-MIT

Experimental Results of the Coaxial Multipactor Experiment. T.P. Graves, B. LaBombard, S.J. Wukitch, I.H. Hutchinson PSFC-MIT Experimental Results of the Coaxial Multipactor Experiment T.P. Graves, B. LaBombard, S.J. Wukitch, I.H. Hutchinson PSFC-MIT Summary A multipactor discharge is a resonant condition for electrons in an

More information

Optimization of Multi-Channel BCH Error Decoding for Common Cases. Russell Dill Master's Thesis Defense April 20, 2015

Optimization of Multi-Channel BCH Error Decoding for Common Cases. Russell Dill Master's Thesis Defense April 20, 2015 Optimization of Multi-Channel BCH Error Decoding for Common Cases Russell Dill Master's Thesis Defense April 20, 2015 Bose-Chaudhuri-Hocquenghem (BCH) BCH is an Error Correcting Code (ECC) and is used

More information

1ms Column Parallel Vision System and It's Application of High Speed Target Tracking

1ms Column Parallel Vision System and It's Application of High Speed Target Tracking Proceedings of the 2(X)0 IEEE International Conference on Robotics & Automation San Francisco, CA April 2000 1ms Column Parallel Vision System and It's Application of High Speed Target Tracking Y. Nakabo,

More information

DCI Requirements Image - Dynamics

DCI Requirements Image - Dynamics DCI Requirements Image - Dynamics Matt Cowan Entertainment Technology Consultants www.etconsult.com Gamma 2.6 12 bit Luminance Coding Black level coding Post Production Implications Measurement Processes

More information

ni.com Digital Signal Processing for Every Application

ni.com Digital Signal Processing for Every Application Digital Signal Processing for Every Application Digital Signal Processing is Everywhere High-Volume Image Processing Production Test Structural Sound Health and Vibration Monitoring RF WiMAX, and Microwave

More information

IEEE Santa Clara ComSoc/CAS Weekend Workshop Event-based analog sensing

IEEE Santa Clara ComSoc/CAS Weekend Workshop Event-based analog sensing IEEE Santa Clara ComSoc/CAS Weekend Workshop Event-based analog sensing Theodore Yu theodore.yu@ti.com Texas Instruments Kilby Labs, Silicon Valley Labs September 29, 2012 1 Living in an analog world The

More information

AN EFFECTIVE CACHE FOR THE ANYWHERE PIXEL ROUTER

AN EFFECTIVE CACHE FOR THE ANYWHERE PIXEL ROUTER University of Kentucky UKnowledge Theses and Dissertations--Electrical and Computer Engineering Electrical and Computer Engineering 2007 AN EFFECTIVE CACHE FOR THE ANYWHERE PIXEL ROUTER Vijai Raghunathan

More information

Transparent low-overhead checkpoint for GPU-accelerated clusters

Transparent low-overhead checkpoint for GPU-accelerated clusters Transparent low-overhead checkpoint for GPU-accelerated clusters Leonardo BAUTISTA GOMEZ 1,3, Akira NUKADA 1, Naoya MARUYAMA 1, Franck CAPPELLO 3,4, Satoshi MATSUOKA 1,2 1 Tokyo Institute of Technology,

More information

Mauricio Álvarez-Mesa ; Chi Ching Chi ; Ben Juurlink ; Valeri George ; Thomas Schierl Parallel video decoding in the emerging HEVC standard

Mauricio Álvarez-Mesa ; Chi Ching Chi ; Ben Juurlink ; Valeri George ; Thomas Schierl Parallel video decoding in the emerging HEVC standard Mauricio Álvarez-Mesa ; Chi Ching Chi ; Ben Juurlink ; Valeri George ; Thomas Schierl Parallel video decoding in the emerging HEVC standard Conference object, Postprint version This version is available

More information

Pre-5G-NR Signal Generation and Analysis Application Note

Pre-5G-NR Signal Generation and Analysis Application Note Pre-5G-NR Signal Generation and Analysis Application Note Products: R&S SMW200A R&S VSE R&S SMW-K114 R&S VSE-K96 R&S FSW R&S FSVA R&S FPS This application note shows how to use Rohde & Schwarz signal generators

More information

Calibrate, Characterize and Emulate Systems Using RFXpress in AWG Series

Calibrate, Characterize and Emulate Systems Using RFXpress in AWG Series Calibrate, Characterize and Emulate Systems Using RFXpress in AWG Series Introduction System designers and device manufacturers so long have been using one set of instruments for creating digitally modulated

More information

BER MEASUREMENT IN THE NOISY CHANNEL

BER MEASUREMENT IN THE NOISY CHANNEL BER MEASUREMENT IN THE NOISY CHANNEL PREPARATION... 2 overview... 2 the basic system... 3 a more detailed description... 4 theoretical predictions... 5 EXPERIMENT... 6 the ERROR COUNTING UTILITIES module...

More information

XC-77 (EIA), XC-77CE (CCIR)

XC-77 (EIA), XC-77CE (CCIR) XC-77 (EIA), XC-77CE (CCIR) Monochrome machine vision video camera modules. 1. Outline The XC-77/77CE is a monochrome video camera module designed for the industrial market. The camera is equipped with

More information

High Performance Carry Chains for FPGAs

High Performance Carry Chains for FPGAs High Performance Carry Chains for FPGAs Matthew M. Hosler Department of Electrical and Computer Engineering Northwestern University Abstract Carry chains are an important consideration for most computations,

More information

Upgrading a FIR Compiler v3.1.x Design to v3.2.x

Upgrading a FIR Compiler v3.1.x Design to v3.2.x Upgrading a FIR Compiler v3.1.x Design to v3.2.x May 2005, ver. 1.0 Application Note 387 Introduction This application note is intended for designers who have an FPGA design that uses the Altera FIR Compiler

More information

2. Logic Elements and Logic Array Blocks in the Cyclone III Device Family

2. Logic Elements and Logic Array Blocks in the Cyclone III Device Family December 2011 CIII51002-2.3 2. Logic Elements and Logic Array Blocks in the Cyclone III Device Family CIII51002-2.3 This chapter contains feature definitions for logic elements (LEs) and logic array blocks

More information

Switching Solutions for Multi-Channel High Speed Serial Port Testing

Switching Solutions for Multi-Channel High Speed Serial Port Testing Switching Solutions for Multi-Channel High Speed Serial Port Testing Application Note by Robert Waldeck VP Business Development, ASCOR Switching The instruments used in High Speed Serial Port testing are

More information

This paper is a preprint of a paper accepted by Electronics Letters and is subject to Institution of Engineering and Technology Copyright.

This paper is a preprint of a paper accepted by Electronics Letters and is subject to Institution of Engineering and Technology Copyright. This paper is a preprint of a paper accepted by Electronics Letters and is subject to Institution of Engineering and Technology Copyright. The final version is published and available at IET Digital Library

More information

L12: Reconfigurable Logic Architectures

L12: Reconfigurable Logic Architectures L12: Reconfigurable Logic Architectures Acknowledgements: Materials in this lecture are courtesy of the following sources and are used with permission. Frank Honore Prof. Randy Katz (Unified Microelectronics

More information

Image Acquisition Technology

Image Acquisition Technology Image Choosing the Right Image Acquisition Technology A Machine Vision White Paper 1 Today, machine vision is used to ensure the quality of everything from tiny computer chips to massive space vehicles.

More information

Generalized Pattern Matching Micro-Engine

Generalized Pattern Matching Micro-Engine Generalized Pattern Matching Micro-Engine Yuanwei Fang*, Raihan Rasool, Dilip Vasudevan*, Andrew A. Chien* University of Chicago * Argonne National Laboratory King Faisal University Big Data Applications

More information

SPATIAL LIGHT MODULATORS

SPATIAL LIGHT MODULATORS SPATIAL LIGHT MODULATORS Reflective XY Series Phase and Amplitude 512x512 A spatial light modulator (SLM) is an electrically programmable device that modulates light according to a fixed spatial (pixel)

More information

Data Converters and DSPs Getting Closer to Sensors

Data Converters and DSPs Getting Closer to Sensors Data Converters and DSPs Getting Closer to Sensors As the data converters used in military applications must operate faster and at greater resolution, the digital domain is moving closer to the antenna/sensor

More information

Placement Rent Exponent Calculation Methods, Temporal Behaviour, and FPGA Architecture Evaluation. Joachim Pistorius and Mike Hutton

Placement Rent Exponent Calculation Methods, Temporal Behaviour, and FPGA Architecture Evaluation. Joachim Pistorius and Mike Hutton Placement Rent Exponent Calculation Methods, Temporal Behaviour, and FPGA Architecture Evaluation Joachim Pistorius and Mike Hutton Some Questions How best to calculate placement Rent? Are there biases

More information

LOW POWER DIGITAL EQUALIZATION FOR HIGH SPEED SERDES. Masum Hossain University of Alberta

LOW POWER DIGITAL EQUALIZATION FOR HIGH SPEED SERDES. Masum Hossain University of Alberta LOW POWER DIGITAL EQUALIZATION FOR HIGH SPEED SERDES Masum Hossain University of Alberta 0 Outline Why ADC-Based receiver? Challenges in ADC-based receiver ADC-DSP based Receiver Reducing impact of Quantization

More information

DATA COMPRESSION USING THE FFT

DATA COMPRESSION USING THE FFT EEE 407/591 PROJECT DUE: NOVEMBER 21, 2001 DATA COMPRESSION USING THE FFT INSTRUCTOR: DR. ANDREAS SPANIAS TEAM MEMBERS: IMTIAZ NIZAMI - 993 21 6600 HASSAN MANSOOR - 993 69 3137 Contents TECHNICAL BACKGROUND...

More information

Supervision of Analogue Signal Paths in Legacy Media Migration Processes using Digital Signal Processing

Supervision of Analogue Signal Paths in Legacy Media Migration Processes using Digital Signal Processing Welcome Supervision of Analogue Signal Paths in Legacy Media Migration Processes using Digital Signal Processing Jörg Houpert Cube-Tec International Oslo, Norway 4th May, 2010 Joint Technical Symposium

More information

Pivoting Object Tracking System

Pivoting Object Tracking System Pivoting Object Tracking System [CSEE 4840 Project Design - March 2009] Damian Ancukiewicz Applied Physics and Applied Mathematics Department da2260@columbia.edu Jinglin Shen Electrical Engineering Department

More information

System Quality Indicators

System Quality Indicators Chapter 2 System Quality Indicators The integration of systems on a chip, has led to a revolution in the electronic industry. Large, complex system functions can be integrated in a single IC, paving the

More information

REAL-TIME H.264 ENCODING BY THREAD-LEVEL PARALLELISM: GAINS AND PITFALLS

REAL-TIME H.264 ENCODING BY THREAD-LEVEL PARALLELISM: GAINS AND PITFALLS REAL-TIME H.264 ENCODING BY THREAD-LEVEL ARALLELISM: GAINS AND ITFALLS Guy Amit and Adi inhas Corporate Technology Group, Intel Corp 94 Em Hamoshavot Rd, etah Tikva 49527, O Box 10097 Israel {guy.amit,

More information

ELEC 204 Digital System Design LABORATORY MANUAL

ELEC 204 Digital System Design LABORATORY MANUAL Elec 24: Digital System Design Laboratory ELEC 24 Digital System Design LABORATORY MANUAL : 4-bit hexadecimal Decoder & 4-bit Increment by N Circuit College of Engineering Koç University Important Note:

More information

Modeling and simulation of altera logic array block using quantum-dot cellular automata

Modeling and simulation of altera logic array block using quantum-dot cellular automata The University of Toledo The University of Toledo Digital Repository Theses and Dissertations 2011 Modeling and simulation of altera logic array block using quantum-dot cellular automata Rohan Kapkar The

More information

L11/12: Reconfigurable Logic Architectures

L11/12: Reconfigurable Logic Architectures L11/12: Reconfigurable Logic Architectures Acknowledgements: Materials in this lecture are courtesy of the following people and used with permission. - Randy H. Katz (University of California, Berkeley,

More information

Lecture 2: Basic FPGA Fabric. James C. Hoe Department of ECE Carnegie Mellon University

Lecture 2: Basic FPGA Fabric. James C. Hoe Department of ECE Carnegie Mellon University 18 643 Lecture 2: Basic FPGA Fabric James. Hoe Department of EE arnegie Mellon University 18 643 F17 L02 S1, James. Hoe, MU/EE/ALM, 2017 Housekeeping Your goal today: know enough to build a basic FPGA

More information

Artisan Technology Group is your source for quality new and certified-used/pre-owned equipment

Artisan Technology Group is your source for quality new and certified-used/pre-owned equipment Artisan Technology Group is your source for quality new and certified-used/pre-owned equipment FAST SHIPPING AND DELIVERY TENS OF THOUSANDS OF IN-STOCK ITEMS EQUIPMENT DEMOS HUNDREDS OF MANUFACTURERS SUPPORTED

More information

FPGA Development for Radar, Radio-Astronomy and Communications

FPGA Development for Radar, Radio-Astronomy and Communications John-Philip Taylor Room 7.03, Department of Electrical Engineering, Menzies Building, University of Cape Town Cape Town, South Africa 7701 Tel: +27 82 354 6741 email: tyljoh010@myuct.ac.za Internet: http://www.uct.ac.za

More information

WiBench: An Open Source Kernel Suite for Benchmarking Wireless Systems

WiBench: An Open Source Kernel Suite for Benchmarking Wireless Systems 1 WiBench: An Open Source Kernel Suite for Benchmarking Wireless Systems Qi Zheng*, Yajing Chen*, Ronald Dreslinski*, Chaitali Chakrabarti +, Achilleas Anastasopoulos*, Scott Mahlke*, Trevor Mudge* *,

More information

Data flow architecture for high-speed optical processors

Data flow architecture for high-speed optical processors Data flow architecture for high-speed optical processors Kipp A. Bauchert and Steven A. Serati Boulder Nonlinear Systems, Inc., Boulder CO 80301 1. Abstract For optical processor applications outside of

More information

CS 61C: Great Ideas in Computer Architecture

CS 61C: Great Ideas in Computer Architecture CS 6C: Great Ideas in Computer Architecture Combinational and Sequential Logic, Boolean Algebra Instructor: Alan Christopher 7/23/24 Summer 24 -- Lecture #8 Review of Last Lecture OpenMP as simple parallel

More information

A Proof of Concept - Challenges of testing high-speed interface on wafer at lower cost

A Proof of Concept - Challenges of testing high-speed interface on wafer at lower cost A Proof of Concept - Challenges of testing high-speed interface on wafer at lower cost How to expand the bandwidth of the cantilever probe card Sony LSI Design Inc. Introduction Design & Simulation PCB

More information

Radiology Physics Lectures: Computers. Associate Professor, Radiology x d

Radiology Physics Lectures: Computers. Associate Professor, Radiology x d COMPUTERS IN MEDICAL IMAGING David Hall, Ph.D. DABR Associate Professor, Radiology x20893 dhll@ djhall@ucsd.edud d 1 introduced into medical imaging in the early 1970 s essential to many modalities X-ray

More information

FAST MOBILITY PARTICLE SIZER SPECTROMETER MODEL 3091

FAST MOBILITY PARTICLE SIZER SPECTROMETER MODEL 3091 FAST MOBILITY PARTICLE SIZER SPECTROMETER MODEL 3091 MEASURES SIZE DISTRIBUTION AND NUMBER CONCENTRATION OF RAPIDLY CHANGING SUBMICROMETER AEROSOL PARTICLES IN REAL-TIME UNDERSTANDING, ACCELERATED IDEAL

More information

Solutions to Embedded System Design Challenges Part II

Solutions to Embedded System Design Challenges Part II Solutions to Embedded System Design Challenges Part II Time-Saving Tips to Improve Productivity In Embedded System Design, Validation and Debug Hi, my name is Mike Juliana. Welcome to today s elearning.

More information

Masters of Science in COMPUTER ENGINEERING

Masters of Science in COMPUTER ENGINEERING PICSEL: Measuring User-Perceived Performance to Control Dynamic Frequency Scaling IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Masters of Science in COMPUTER ENGINEERING By Jack Cosgrove

More information

A Fast Constant Coefficient Multiplier for the XC6200

A Fast Constant Coefficient Multiplier for the XC6200 A Fast Constant Coefficient Multiplier for the XC6200 Tom Kean, Bernie New and Bob Slous Xilinx Inc. Abstract. We discuss the design of a high performance constant coefficient multiplier on the Xilinx

More information

COE328 Course Outline. Fall 2007

COE328 Course Outline. Fall 2007 COE28 Course Outline Fall 2007 1 Objectives This course covers the basics of digital logic circuits and design. Through the basic understanding of Boolean algebra and number systems it introduces the student

More information

an entire Radio station for only $100 per month the Next Generation of the #1 Satellite Automation system the XTREME Solutions Program

an entire Radio station for only $100 per month the Next Generation of the #1 Satellite Automation system the XTREME Solutions Program CUE PGM CUE PGM EXT CH 16 EXT CH 16 REC CH 13 CH 14 CH 15 REC CH13 CH 14 CH15 CH 10 CH 11 CH 12 CH10 CH 11 CH 12 CH 7 CH 8 CH 9 CH9 CH7 CH 8 2 CH 4 CH 5 CH 6 2 1 CH 4 CH 5 CH 6 1 CH 1 CH 2 CH 3 CH 1 CH

More information

Lossless Compression Algorithms for Direct- Write Lithography Systems

Lossless Compression Algorithms for Direct- Write Lithography Systems Lossless Compression Algorithms for Direct- Write Lithography Systems Hsin-I Liu Video and Image Processing Lab Department of Electrical Engineering and Computer Science University of California at Berkeley

More information

ECEN689: Special Topics in High-Speed Links Circuits and Systems Spring 2011

ECEN689: Special Topics in High-Speed Links Circuits and Systems Spring 2011 ECEN689: Special Topics in High-Speed Links Circuits and Systems Spring 2011 Lecture 9: TX Multiplexer Circuits Sam Palermo Analog & Mixed-Signal Center Texas A&M University Announcements & Agenda Next

More information

Transitioning from NTSC (analog) to HD Digital Video

Transitioning from NTSC (analog) to HD Digital Video To Place an Order or get more info. Call Uniforce Sales and Engineering (510) 657 4000 www.uniforcesales.com Transitioning from NTSC (analog) to HD Digital Video Sheet 1 NTSC Analog Video NTSC video -color

More information

High-Efficient Parallel CAVLC Encoders on Heterogeneous Multicore Architectures

High-Efficient Parallel CAVLC Encoders on Heterogeneous Multicore Architectures 46 H. Y. SU, M. WEN, J. REN, N. WU, J. CHAI, C.Y. ZHANG, HIGH-EFFICIENT PARALLEL CAVLC ENCODER High-Efficient Parallel CAVLC Encoders on Heterogeneous Multicore Architectures Huayou SU, Mei WEN, Ju REN,

More information

National Park Service Photo. Utah 400 Series 1. Digital Routing Switcher.

National Park Service Photo. Utah 400 Series 1. Digital Routing Switcher. National Park Service Photo Utah 400 Series 1 Digital Routing Switcher Utah Scientific has been involved in the design and manufacture of routing switchers for audio and video signals for over thirty years.

More information

Highly Parallel HEVC Decoding for Heterogeneous Systems with CPU and GPU

Highly Parallel HEVC Decoding for Heterogeneous Systems with CPU and GPU 2017. This manuscript version (accecpted manuscript) is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/. Highly Parallel HEVC Decoding for Heterogeneous

More information

Why Engineers Ignore Cable Loss

Why Engineers Ignore Cable Loss Why Engineers Ignore Cable Loss By Brig Asay, Agilent Technologies Companies spend large amounts of money on test and measurement equipment. One of the largest purchases for high speed designers is a real

More information

MULTI-CORE SOFTWARE ARCHITECTURE FOR THE SCALABLE HEVC DECODER. Wassim Hamidouche, Mickael Raulet and Olivier Déforges

MULTI-CORE SOFTWARE ARCHITECTURE FOR THE SCALABLE HEVC DECODER. Wassim Hamidouche, Mickael Raulet and Olivier Déforges 2014 IEEE International Conference on Acoustic, Speech and Signal Processing (ICASSP) MULTI-CORE SOFTWARE ARCHITECTURE FOR THE SCALABLE HEVC DECODER Wassim Hamidouche, Mickael Raulet and Olivier Déforges

More information

HD-SDI Express User Training. J.Egri 4/09 1

HD-SDI Express User Training. J.Egri 4/09 1 HD-SDI Express User Training J.Egri 4/09 1 Features SDI interface Supports 720p, 1080i and 1080p formats. Supports SMPTE 292M serial interface operating at 1.485 Gbps. Supports SMPTE 274M and 296M framing.

More information

PrepSKA WP2 Meeting Software and Computing. Duncan Hall 2011-October-19

PrepSKA WP2 Meeting Software and Computing. Duncan Hall 2011-October-19 PrepSKA WP2 Meeting Software and Computing Duncan Hall 2011-October-19 Imaging context 1 of 2: 2 Imaging context 2 of 2: 3 Agenda: - Progress since 2010 October - CoDR approach and expectations - Presentation

More information

Fa m i l y o f PXI Do w n c o n v e r t e r Mo d u l e s Br i n g s 26.5 GHz RF/MW

Fa m i l y o f PXI Do w n c o n v e r t e r Mo d u l e s Br i n g s 26.5 GHz RF/MW page 1 of 6 Fa m i l y o f PXI Do w n c o n v e r t e r Mo d u l e s Br i n g s 26.5 GHz RF/MW Measurement Technology to the PXI Platform by Michael N. Granieri, Ph.D. Background: The PXI platform is known

More information

CHARACTERIZATION OF END-TO-END DELAYS IN HEAD-MOUNTED DISPLAY SYSTEMS

CHARACTERIZATION OF END-TO-END DELAYS IN HEAD-MOUNTED DISPLAY SYSTEMS CHARACTERIZATION OF END-TO-END S IN HEAD-MOUNTED DISPLAY SYSTEMS Mark R. Mine University of North Carolina at Chapel Hill 3/23/93 1. 0 INTRODUCTION This technical report presents the results of measurements

More information

MAutoPitch. Presets button. Left arrow button. Right arrow button. Randomize button. Save button. Panic button. Settings button

MAutoPitch. Presets button. Left arrow button. Right arrow button. Randomize button. Save button. Panic button. Settings button MAutoPitch Presets button Presets button shows a window with all available presets. A preset can be loaded from the preset window by double-clicking on it, using the arrow buttons or by using a combination

More information

Altera's 28-nm FPGAs Optimized for Broadcast Video Applications

Altera's 28-nm FPGAs Optimized for Broadcast Video Applications Altera's 28-nm FPGAs Optimized for Broadcast Video Applications WP-01163-1.0 White Paper This paper describes how Altera s 40-nm and 28-nm FPGAs are tailored to help deliver highly-integrated, HD studio

More information

Pulsed Klystrons for Next Generation Neutron Sources Edward L. Eisen - CPI, Inc. Palo Alto, CA, USA

Pulsed Klystrons for Next Generation Neutron Sources Edward L. Eisen - CPI, Inc. Palo Alto, CA, USA Pulsed Klystrons for Next Generation Neutron Sources Edward L. Eisen - CPI, Inc. Palo Alto, CA, USA Abstract The U.S. Department of Energy (DOE) Office of Science has funded the construction of a new accelerator-based

More information

FULL-AUTOMATIC DJ MIXING SYSTEM WITH OPTIMAL TEMPO ADJUSTMENT BASED ON MEASUREMENT FUNCTION OF USER DISCOMFORT

FULL-AUTOMATIC DJ MIXING SYSTEM WITH OPTIMAL TEMPO ADJUSTMENT BASED ON MEASUREMENT FUNCTION OF USER DISCOMFORT 10th International Society for Music Information Retrieval Conference (ISMIR 2009) FULL-AUTOMATIC DJ MIXING SYSTEM WITH OPTIMAL TEMPO ADJUSTMENT BASED ON MEASUREMENT FUNCTION OF USER DISCOMFORT Hiromi

More information

Figure 1: Feature Vector Sequence Generator block diagram.

Figure 1: Feature Vector Sequence Generator block diagram. 1 Introduction Figure 1: Feature Vector Sequence Generator block diagram. We propose designing a simple isolated word speech recognition system in Verilog. Our design is naturally divided into two modules.

More information

Commsonic. Satellite FEC Decoder CMS0077. Contact information

Commsonic. Satellite FEC Decoder CMS0077. Contact information Satellite FEC Decoder CMS0077 Fully compliant with ETSI EN-302307-1 / -2. The IP core accepts demodulated digital IQ inputs and is designed to interface directly with the CMS0059 DVB-S2 / DVB-S2X Demodulator

More information

3-D position sensitive CdZnTe gamma-ray spectrometers

3-D position sensitive CdZnTe gamma-ray spectrometers Nuclear Instruments and Methods in Physics Research A 422 (1999) 173 178 3-D position sensitive CdZnTe gamma-ray spectrometers Z. He *, W.Li, G.F. Knoll, D.K. Wehe, J. Berry, C.M. Stahle Department of

More information

OF AN ADVANCED LUT METHODOLOGY BASED FIR FILTER DESIGN PROCESS

OF AN ADVANCED LUT METHODOLOGY BASED FIR FILTER DESIGN PROCESS IMPLEMENTATION OF AN ADVANCED LUT METHODOLOGY BASED FIR FILTER DESIGN PROCESS 1 G. Sowmya Bala 2 A. Rama Krishna 1 PG student, Dept. of ECM. K.L.University, Vaddeswaram, A.P, India, 2 Assistant Professor,

More information

3/5/2017. A Register Stores a Set of Bits. ECE 120: Introduction to Computing. Add an Input to Control Changing a Register s Bits

3/5/2017. A Register Stores a Set of Bits. ECE 120: Introduction to Computing. Add an Input to Control Changing a Register s Bits University of Illinois at Urbana-Champaign Dept. of Electrical and Computer Engineering ECE 120: Introduction to Computing Registers A Register Stores a Set of Bits Most of our representations use sets

More information

Experiment 7: Bit Error Rate (BER) Measurement in the Noisy Channel

Experiment 7: Bit Error Rate (BER) Measurement in the Noisy Channel Experiment 7: Bit Error Rate (BER) Measurement in the Noisy Channel Modified Dr Peter Vial March 2011 from Emona TIMS experiment ACHIEVEMENTS: ability to set up a digital communications system over a noisy,

More information

Basic rules for the design of RF Controls in High Intensity Proton Linacs. Particularities of proton linacs wrt electron linacs

Basic rules for the design of RF Controls in High Intensity Proton Linacs. Particularities of proton linacs wrt electron linacs Basic rules Basic rules for the design of RF Controls in High Intensity Proton Linacs Particularities of proton linacs wrt electron linacs Non-zero synchronous phase needs reactive beam-loading compensation

More information

Spatial Light Modulators XY Series

Spatial Light Modulators XY Series Spatial Light Modulators XY Series Phase and Amplitude 512x512 and 256x256 A spatial light modulator (SLM) is an electrically programmable device that modulates light according to a fixed spatial (pixel)

More information

Tutorial on Technical and Performance Benefits of AD719x Family

Tutorial on Technical and Performance Benefits of AD719x Family The World Leader in High Performance Signal Processing Solutions Tutorial on Technical and Performance Benefits of AD719x Family AD7190, AD7191, AD7192, AD7193, AD7194, AD7195 This slide set focuses on

More information

COMP 249 Advanced Distributed Systems Multimedia Networking. Video Compression Standards

COMP 249 Advanced Distributed Systems Multimedia Networking. Video Compression Standards COMP 9 Advanced Distributed Systems Multimedia Networking Video Compression Standards Kevin Jeffay Department of Computer Science University of North Carolina at Chapel Hill jeffay@cs.unc.edu September,

More information