Scalability of MB-level Parallelism for H.264 Decoding
Scalability of Macroblock-level Parallelism for H.264 Decoding
Mauricio Alvarez Mesa (1), Alex Ramírez (1,2), Mateo Valero (1,2), Arnaldo Azevedo (3), Cor Meenderinck (3), Ben Juurlink (3)
(1) Universitat Politècnica de Catalunya (UPC), Barcelona, Spain
(2) Barcelona Supercomputing Center (BSC), Barcelona, Spain
(3) Delft University of Technology (TUD), Delft, The Netherlands
December 15, 2009
Outline
1 Introduction
Trends in digital video
- Towards high-quality systems
  - High-definition video and high-quality video codecs
  - High computational complexity
- Towards mobile and integrated systems
  - Convergence of mobile and multimedia systems
  - Real-time, area and power constraints
- Towards multiple formats and extensions
  - H.264 extensions and different video codecs (MPEG-2, VC-1)
  - Programmable processors instead of application-specific hardware
Trends in multicore computer architecture
- Towards manycore systems
  - Hundreds of cores on a chip
  - Power and complexity wall: simpler cores
  - Massive thread-level parallelism
- Towards heterogeneous/asymmetric architectures
  - Asymmetric cores: same ISA but different performance
  - Heterogeneous cores: accelerators for different application domains
  - Specialized architectures have better performance/power/area benefits
Challenges of video applications in the multicore era
- Requirements of digital video applications
  - Performance: high-quality video translates into high computational complexity
  - Efficiency: embedded environments impose real-time and power constraints
  - Flexibility: multiple video formats and new extensions require programmability
- Opportunities of multicore architectures
  - Scalability
  - Applications can benefit from multicores only if they can be parallelized
Outline (section 2)
- H.264 video decoder
- Macroblock-level parallelism
- Theoretical Maximum Speed-up
- Abstract Trace-driven Simulation
H.264 Decoder
- Block based: macroblocks (MBs) are the basic coding unit
- Hybrid: motion compensation + transform coding (DCT)
Macroblock-level parallelism
- The 2D-wave processing order satisfies the MB dependencies and exposes thread-level parallelism (TLP)
- Scalability depends on frame resolution
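To illustrate why scalability depends on frame resolution, the following minimal sketch counts how many MBs are available in each step of a 2D wave. It assumes, beyond what the slide states, that every MB takes one time slot and depends on its left, upper-left, upper and upper-right neighbours, so that MB (x, y) can start at slot x + 2y. The function name `wavefront_profile` and the list of resolutions are illustrative choices, not part of the original work.

```python
from collections import Counter


def wavefront_profile(width_mb, height_mb):
    """Return {time_slot: number of MBs that can be decoded in that slot}.

    Assumption: constant decoding time of one slot per MB, and MB (x, y)
    becomes ready at slot x + 2*y (2D-wave order).
    """
    slots = Counter()
    for y in range(height_mb):
        for x in range(width_mb):
            slots[x + 2 * y] += 1
    return dict(slots)


# Frame sizes in macroblocks of 16x16 pixels (illustrative resolutions).
RESOLUTIONS = {
    "SD 720x576": (45, 36),
    "HD 1280x720": (80, 45),
    "FHD 1920x1088": (120, 68),
}

for name, (w, h) in RESOLUTIONS.items():
    profile = wavefront_profile(w, h)
    print(name, "time slots:", len(profile),
          "peak parallel MBs:", max(profile.values()))
```

Under these assumptions the peak number of parallel MBs grows with the frame width, which is one way to read the claim that larger frames scale better.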
Parallel model
- Each frame in a video sequence can be represented with a finite Directed Acyclic Graph (DAG)
- Each node in the DAG represents the decoding of one MB by one processor
Theoretical maximum performance [1]
[Figure: number of parallel macroblocks over time]
The maximum speedup cannot be reached because:
- MB processing time is variable and input dependent, which causes load imbalance
- Thread synchronization time is not negligible, which adds synchronization overhead
[1] Meenderinck et al., "Parallel Scalability of Video Decoders"
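As a worked example of the theoretical maximum, consider a frame of W x H macroblocks with constant decoding time per MB. The derivation below assumes the 2D-wave order in which MB (x, y) starts at time slot x + 2y, and a Full HD frame of 120 x 68 MBs (1920 x 1088 pixels). Both are assumptions made for illustration, not figures taken from the slide.

```latex
\begin{align*}
T_{\text{seq}} &= W \cdot H = 120 \cdot 68 = 8160 \\
T_{\text{par}} &= W + 2\,(H - 1) = 120 + 134 = 254 \\
S_{\max} &= \frac{T_{\text{seq}}}{T_{\text{par}}} = \frac{8160}{254} \approx 32.1
\end{align*}
```

Here the sequential time counts one slot per MB, and the parallel time is the slot index of the last MB (W - 1, H - 1) plus one.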
Effects of variable decoding time
[Figure: speedup vs. number of frames for constant MB decoding time and for the sequences blue_sky, pedestrian, riverbed and rush_hour]
- Average speedup reduction: 33%
- Actual performance depends on input content
Effects of thread synchronization overhead
[Figure: speedup vs. synchronization overhead (as a factor of MB decoding time) for blue_sky, pedestrian, riverbed and rush_hour]
- Speedup reduction: 38% when overhead = MB decoding time
- Observed overhead is greater than MB decoding time
Outline (section 3)
- Experimental Platform
- Performance Analysis
- Removing the bottlenecks
Parallel architecture: SGI Altix
- Base module: 2 dual-core Intel Itanium2 processors [2]
- Distributed shared memory (cc-NUMA)
[2] Rusu S., "Circuit Technologies for Multi-Core Processor Design"
Benchmark: HD-VideoBench
- Test sequences: Full High Definition (FHD), 100 frames, 25 fps
- H.264 decoder: FFmpeg modified for MB-level parallelization
Programming model
- Single Program Multiple Data (SPMD)
- N+2 threads: 1 master, 1 CABAC, N workers
- Task pool for dynamic load balancing
Scheduling strategies
- Static scheduling
  - Master thread: checks the dependencies and inserts work in the task queue
  - Worker threads: process tasks and update dependencies
- Dynamic scheduling
  - Master thread: inserts the first MB in the task queue and waits for the last MB
  - Worker threads: take work from the task queue, process tasks, update dependencies and insert ready MBs in the task queue
- Tail-submit: if at least one MB is ready, the worker processes it directly instead of inserting it in the task queue
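The dynamic scheduling and tail-submit steps listed above can be sketched with a shared task queue. This is a minimal illustration in Python, not the FFmpeg-based implementation used in the experiments: the function `decode_frame`, the `decode_mb` callback and the dependency pattern (left, upper-left, upper, upper-right) are assumptions introduced for the example, and the CABAC thread is left out.

```python
import queue
import threading


def decode_frame(width_mb, height_mb, n_workers, decode_mb):
    """Decode one frame: dynamic scheduling plus right-first tail-submit."""

    def in_frame(x, y):
        return 0 <= x < width_mb and 0 <= y < height_mb

    def successors(x, y):
        # MBs that depend on (x, y). The right neighbour comes first, so
        # it is the one a worker keeps for itself (right-first variant).
        cand = [(x + 1, y), (x - 1, y + 1), (x, y + 1), (x + 1, y + 1)]
        return [mb for mb in cand if in_frame(*mb)]

    def n_predecessors(x, y):
        cand = [(x - 1, y), (x - 1, y - 1), (x, y - 1), (x + 1, y - 1)]
        return sum(in_frame(*mb) for mb in cand)

    pending = {(x, y): n_predecessors(x, y)
               for y in range(height_mb) for x in range(width_mb)}
    remaining = [width_mb * height_mb]
    lock = threading.Lock()
    tasks = queue.Queue()
    frame_done = threading.Event()

    def worker():
        while True:
            mb = tasks.get()
            if mb is None:              # sentinel from the master: stop
                return
            while mb is not None:
                decode_mb(*mb)
                ready = []
                with lock:              # update dependencies of successors
                    for succ in successors(*mb):
                        pending[succ] -= 1
                        if pending[succ] == 0:
                            ready.append(succ)
                    remaining[0] -= 1
                    if remaining[0] == 0:
                        frame_done.set()
                # Tail-submit: process the first ready MB directly and
                # insert only the remaining ready MBs in the task queue.
                mb = ready.pop(0) if ready else None
                for other in ready:
                    tasks.put(other)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    tasks.put((0, 0))                   # master inserts the first MB ...
    frame_done.wait()                   # ... and waits for the last one
    for _ in threads:
        tasks.put(None)
    for t in threads:
        t.join()


order = []
decode_frame(8, 4, n_workers=3, decode_mb=lambda x, y: order.append((x, y)))
print("decoded MBs:", len(order))
```

Keeping one ready MB in the worker avoids a queue insertion and a queue removal for that MB, which is the kind of synchronization cost tail-submit is meant to reduce. Removing the `ready.pop(0)` shortcut and queueing every ready MB would give plain dynamic scheduling.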
Speedup and scalability
[Figure: average speedup vs. number of threads for static scheduling, dynamic scheduling, tail submit right-first and tail submit down-left-first]
- Static scheduling: suffers from load imbalance
- Dynamic scheduling: suffers from synchronization overhead
- Tail submit: reduces synchronization overhead and exploits data locality
Profiling analysis
[Figure: synchronization overhead ratio (as a factor of MB decoding time) vs. number of threads, for dynamic scheduling without and with tail-submit]
- Tail-submit significantly reduces the synchronization overhead
- Submitting new tasks to the task queue is the main source of overhead
Impact of the CABAC entropy decoder
[Figure: execution time (us/frame) vs. number of threads for control_thread, hl_decode_mb and decode_cabac]
- CABAC should be executed sequentially
- The behavior of the CABAC execution time is a side effect of the cc-NUMA architecture
Identifying the acceleration requirements
A scalable MB-level parallelization requires:
- Removing the CABAC bottleneck
- Low-latency synchronization primitives
These limiting factors offer a potential for multicore acceleration.
Multicore acceleration evaluation:
- Dedicated and accelerated CABAC processor
- On-chip hardware-supported synchronization
- Evaluated using a fast trace-driven multicore simulation
Accelerating CABAC entropy decoding
[Figure: speed-up and frames per second vs. number of processors (+1 master) for CABAC acceleration factors 1.0x, 1.5x, 2.0x, 3.0x, 4.0x, 5.0x and 10.0x, with real-time thresholds at 25, 50 and 100 fps]
- FHD 25 fps: CABAC 1X, 7 worker processors
- FHD 50 fps: CABAC 1.5X, 16 worker processors
- FHD 100 fps: not enough parallelism
Accelerating thread synchronization
[Figure: speed-up and frames per second vs. number of processors (+1 master) for synchronization latencies of 1, 10, 100, 500, 1000, 5000 and 10000 ns, plus sync-altix-1p and sync-altix-sw, with real-time thresholds at 25, 50 and 100 fps]
- Altix sync. time: ns; without contention: ns
- FHD 25 fps: sync. latency 500 ns, 7 workers
- FHD 50 fps: sync. latency 100 ns, 16 workers
Limitations to scalability:
- Load imbalance
- Synchronization overhead
- CABAC sequential bottleneck
Implementation on a cc-NUMA machine:
- Best scheduling strategy: dynamic scheduling + tail-submit
Acceleration potential:
- Estimation of the required CABAC acceleration
- Limits on the latency of thread synchronization
Acknowledgements
This work has been supported by:
- HiPEAC, European Network of Excellence on High Performance and Embedded Architecture and Compilation
- The European Commission in the context of the SARC project (contract no )
- The Spanish Ministry of Education (contract no. TIN )
Backup slides: trace-driven DAG simulator
DAG simulator:
- Creates the DAG for each frame in the video using real execution traces
- Calculates the Task Processing Time (TPT) of every node as:
  $$\mathrm{TPT}(n) = w_n + s_n + \max(\mathrm{TFT}(pr_n)) \qquad (1)$$
  where $w_n$ is the time required to process the task, $s_n$ is the time required for thread synchronization, and $\max(\mathrm{TFT}(pr_n))$ is the maximum task finish time (TFT) over the immediate predecessors of that task.
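The recurrence in equation (1) can be evaluated with a single pass over the MBs in raster order, since every predecessor of an MB comes earlier in that order. The sketch below is a hedged illustration: it uses synthetic per-MB times instead of real execution traces, assumes an unlimited number of processors, and assumes the usual left, upper-left, upper and upper-right dependencies. The names `simulate_frame`, `work_time` and `sync_time` are hypothetical.

```python
def simulate_frame(width_mb, height_mb, work_time, sync_time):
    """Evaluate equation (1) for every MB of one frame.

    work_time(x, y) plays the role of w_n and sync_time the role of s_n
    (taken as constant here, which is a simplification). Returns the
    per-MB finish times and the speedup over sequential decoding.
    """
    finish = {}
    total_work = 0.0
    for y in range(height_mb):          # raster order is a valid
        for x in range(width_mb):       # topological order of the DAG
            preds = [(x - 1, y), (x - 1, y - 1), (x, y - 1), (x + 1, y - 1)]
            latest = max((finish[p] for p in preds if p in finish),
                         default=0.0)
            w = work_time(x, y)
            finish[(x, y)] = w + sync_time + latest
            total_work += w
    parallel_time = max(finish.values())
    return finish, total_work / parallel_time


# Constant decoding time, no synchronization cost: the ideal 2D wave.
_, ideal = simulate_frame(120, 68, lambda x, y: 1.0, 0.0)
# Same frame with a synchronization cost equal to the MB decoding time.
_, with_sync = simulate_frame(120, 68, lambda x, y: 1.0, 1.0)
print("ideal speedup:", round(ideal, 2))
print("speedup with sync overhead:", round(with_sync, 2))
```

With constant synthetic times the result only reflects the shape of the DAG. The percentages reported on the earlier slides come from real traces with variable MB decoding times, so this sketch is not expected to reproduce them.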
Backup slides: base architecture (dual-core Itanium2 processor)
- Intel Itanium2 processor, 1.6 GHz, 90 nm
- 16 KB I-L1, 16 KB D-L1 cache per core
- 1 MB I-L2, 256 KB D-L2 cache per core
- Shared 8 MB (I+D)-L3
- 8 GB of RAM
John-Philip Taylor Room 7.03, Department of Electrical Engineering, Menzies Building, University of Cape Town Cape Town, South Africa 7701 Tel: +27 82 354 6741 email: tyljoh010@myuct.ac.za Internet: http://www.uct.ac.za
More informationAN MPEG-4 BASED HIGH DEFINITION VTR
AN MPEG-4 BASED HIGH DEFINITION VTR R. Lewis Sony Professional Solutions Europe, UK ABSTRACT The subject of this paper is an advanced tape format designed especially for Digital Cinema production and post
More informationMPEG decoder Case. K.A. Vissers UC Berkeley Chamleon Systems Inc. and Pieter van der Wolf. Philips Research Eindhoven, The Netherlands
MPEG decoder Case K.A. Vissers UC Berkeley Chamleon Systems Inc. and Pieter van der Wolf Philips Research Eindhoven, The Netherlands 1 Outline Introduction Consumer Electronics Kahn Process Networks Revisited
More informationFPGA based Satellite Set Top Box prototype design
9 th International conference on Sciences and Techniques of Automatic control & computer engineering FPGA based Satellite Set Top Box prototype design Mohamed Frad 1,2, Lamjed Touil 1, Néji Gabsi 2, Abdessalem
More informationChapter 3 Instruction-Level Parallelism and its Exploitation (Part 1)
Chapter 3 Instruction-Level Parallelism and its Exploitation (Part 1) ILP vs. Parallel Computers Dynamic Scheduling (Section 3.4, 3.5) Dynamic Branch Prediction (Section 3.3) Hardware Speculation and Precise
More informationOverview: Video Coding Standards
Overview: Video Coding Standards Video coding standards: applications and common structure ITU-T Rec. H.261 ISO/IEC MPEG-1 ISO/IEC MPEG-2 State-of-the-art: H.264/AVC Video Coding Standards no. 1 Applications
More informationHigh-Efficient Parallel CAVLC Encoders on Heterogeneous Multicore Architectures
46 H. Y. SU, M. WEN, J. REN, N. WU, J. CHAI, C.Y. ZHANG, HIGH-EFFICIENT PARALLEL CAVLC ENCODER High-Efficient Parallel CAVLC Encoders on Heterogeneous Multicore Architectures Huayou SU, Mei WEN, Ju REN,
More informationMotion Compensation Hardware Accelerator Architecture for H.264/AVC
Motion Compensation Hardware Accelerator Architecture for H.264/AVC Bruno Zatt 1, Valter Ferreira 1, Luciano Agostini 2, Flávio R. Wagner 1, Altamiro Susin 3, and Sergio Bampi 1 1 Informatics Institute
More informationDesign and Analysis of Modified Fast Compressors for MAC Unit
Design and Analysis of Modified Fast Compressors for MAC Unit Anusree T U 1, Bonifus P L 2 1 PG Student & Dept. of ECE & Rajagiri School of Engineering & Technology 2 Assistant Professor & Dept. of ECE
More informationParallel SHVC decoder: Implementation and analysis
Parallel SHVC decoder: Implementation and analysis Wassim Hamidouche, Mickaël Raulet, Olivier Deforges To cite this version: Wassim Hamidouche, Mickaël Raulet, Olivier Deforges. Parallel SHVC decoder:
More informationAN-ENG-001. Using the AVR32 SoC for real-time video applications. Written by Matteo Vit, Approved by Andrea Marson, VERSION: 1.0.0
Written by Matteo Vit, R&D Engineer Dave S.r.l. Approved by Andrea Marson, CTO Dave S.r.l. DAVE S.r.l. www.dave.eu VERSION: 1.0.0 DOCUMENT CODE: AN-ENG-001 NO. OF PAGES: 8 AN-ENG-001 Using the AVR32 SoC
More informationDesign of Fault Coverage Test Pattern Generator Using LFSR
Design of Fault Coverage Test Pattern Generator Using LFSR B.Saritha M.Tech Student, Department of ECE, Dhruva Institue of Engineering & Technology. Abstract: A new fault coverage test pattern generator
More informationEN2911X: Reconfigurable Computing Topic 01: Programmable Logic. Prof. Sherief Reda School of Engineering, Brown University Fall 2014
EN2911X: Reconfigurable Computing Topic 01: Programmable Logic Prof. Sherief Reda School of Engineering, Brown University Fall 2014 1 Contents 1. Architecture of modern FPGAs Programmable interconnect
More informationWHITE PAPER. Perspectives and Challenges for HEVC Encoding Solutions. Xavier DUCLOUX, December >>
Perspectives and Challenges for HEVC Encoding Solutions Xavier DUCLOUX, December 2013 >> www.thomson-networks.com 1. INTRODUCTION... 3 2. HEVC STATUS... 3 2.1 HEVC STANDARDIZATION... 3 2.2 HEVC TOOL-BOX...
More information