Parallelization of Multimedia Applications by Compiler on Multicores for Consumer Electronics
Takamichi Miyamoto, Saori Asaka, Hiroki Mikami, Masayoshi Mase, Keiji Kimura and Hironori Kasahara
Waseda University

Multicore processors have attracted much attention because they offer an opportunity to overcome rising power consumption, the difficulty of raising processor clock speeds, and lengthening hardware/software development periods. Speeding up multimedia applications is also required as consumer electronics such as mobile phones, digital TVs and games advance. This paper evaluates the parallel processing performance of multimedia applications, namely MPEG2 encode and decode, MP3 encode and JPEG 2000 encode, parallelized by the OSCAR parallelizing compiler using a newly developed multicore API. The evaluation targets are the FR1000 multicore processor with 4 VLIW cores developed by Fujitsu Ltd., and the RP1 multicore processor with 4 SH-4A cores jointly developed by Renesas Technology Corp., Hitachi Ltd. and Waseda University. As a result, the OSCAR compiler gave a 3.27 times speedup on average using 4 cores on the FR1000 multicore and a 3.31 times speedup on average using 4 cores on the RP1 multicore.

1. (surviving terms): Quad Xeon, Core 2 Quad, AMD Quad Phenom, IBM Power6, Sun SPARC T2, SCE/IBM/ Cell 1), NEC /ARM MPCore, MP211 2), FR1000 3), UniPhier 4).
Section 1, continued (surviving terms): SH-X3 5), DSP, Intel SSE, SIMD, NEDO, API, The Multicore ASSOCIATION 6), OSCAR API, FR1000, RP1.

Fig. 1 OSCAR Multicore Architecture
(Diagram labels: CMP 0 (chip multiprocessor 0) to CMP m; PE 0, PE 1, ..., PE n, each with CPU, LPM/I-Cache, LDM/D-cache, DSM and Network Interface; CSM / L2 Cache; intra-chip connection network (multiple buses, crossbar, etc.); CSM j; I/O devices k; inter-chip connection network (crossbar, buses, multistage network, etc.).)

2. (surviving terms): NEDO, API, FR1000, MP211, CELL, UniPhier, OSCAR, RP1.

2.1 OSCAR 7),8)
(surviving terms): PE, LDM, DSM, LPM, CPU, DTU, interconnection network, CSM.
Fig. 2 Block Diagram of FR1000
(Diagram labels: Core 0 (Primary) to Core 3, each with I$: 32KB, D$: 32KB and WorkRAM 128KB; Chip Local Memory Bus; Inter-Processor Communication Bus; Main Memory Controller; Internal DMAC; DDR-SDRAM (off-chip CSM).)

FR1000 (surviving terms): FR, way VLIW, 32KB, OSCAR DSM, 128KB WorkRAM, DTU, DMAC, off-chip CSM, 1GB.

RP1 (surviving terms): SH-4A, SH-X3, 32KB, OSCAR LDM, 16KB OLRAM, LPM, 8KB ILRAM, DSM, 128KB URAM, on-chip CSM 128KB, SMP, AMP.

Fig. 3 Block Diagram of RP1
(Diagram labels: Core 0 to Core 3, each with CPU, FPU, I$ 32K, D$ 32K, CCN, DTU, ILRAM 8K, OLRAM 16K and URAM 128K; snoop bus; snoop controller (SNC); on-chip system bus (SHwy); LBSC; SRAM; DBSC; DDR2 SDRAM; CSM 128K.)

3. API (surviving terms): FR1000, RP1, OpenMP API 9), C, FORTRAN, SMP, OSCAR API.

4. OSCAR
4.1 (surviving terms): BPA.
Fig. 4 Evaluation Flow on Multicores using Developed API
(Diagram labels: C; OSCAR; API; Proc0 Scheduled Tasks, Proc1 Scheduled Tasks, ..., ProcN Scheduled Tasks; FR1000; RP1.)

Fig. 5 Macro flow graph and Macro-task graph
(Panels: (a) Macro Flow Graph (MFG); (b) Macro Task Graph (MTG). Legend: Data Dependency, Control Flow, Conditional Branch, Extended Control Dependency, AND, OR, Original Control Flow.)

Section 4, continued (surviving terms): RB, SB, MT, MFG, MTG, DLG.

4.3 (surviving terms): MTG, MT, PG, PE, DLG, ETF/CP (Earliest Task First/Critical Path), ETF/CP considering DLG.

5. (surviving terms): API, FR1000, RP1, OSCAR.
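As a rough illustration of the ETF/CP (Earliest Task First/Critical Path) heuristic named in Section 4.3, the sketch below statically schedules a small task graph onto a few processors. It is a generic list scheduler written for this note, not the OSCAR compiler's implementation: the `Task` record, the function names and the tie-breaking rule (longer critical path wins) are assumptions, and the "considering DLG" variant mentioned in the source is not modelled.

```c
#include <assert.h>

#define MAX_TASKS 16
#define MAX_PROCS 4

/* Hypothetical macro-task record: an execution cost in arbitrary time
   units and the list of predecessor tasks in the task graph (a DAG). */
typedef struct {
    int cost;
    int npred;
    int pred[MAX_TASKS];
} Task;

/* Longest path from task t to an exit of the graph, including t's own cost.
   memo[] must be initialised to -1; costs are assumed non-negative. */
static int critical_path(const Task *tasks, int n, int t, int *memo)
{
    if (memo[t] >= 0) return memo[t];
    int longest = 0;
    for (int s = 0; s < n; s++)
        for (int k = 0; k < tasks[s].npred; k++)
            if (tasks[s].pred[k] == t) {
                int len = critical_path(tasks, n, s, memo);
                if (len > longest) longest = len;
            }
    return memo[t] = tasks[t].cost + longest;
}

/* ETF/CP-style list scheduling: at each step pick the ready task and the
   processor giving the earliest start time; ties go to the task with the
   longer critical path. Fills start[] and proc[], returns the makespan. */
int etf_cp_schedule(const Task *tasks, int n, int nprocs, int *start, int *proc)
{
    assert(n > 0 && n <= MAX_TASKS && nprocs > 0 && nprocs <= MAX_PROCS);
    int cp[MAX_TASKS], finish[MAX_TASKS], done[MAX_TASKS] = {0};
    int free_at[MAX_PROCS] = {0};
    int makespan = 0;

    for (int t = 0; t < n; t++) cp[t] = -1;
    for (int t = 0; t < n; t++) critical_path(tasks, n, t, cp);

    for (int scheduled = 0; scheduled < n; scheduled++) {
        int best_t = -1, best_p = -1, best_start = 0;
        for (int t = 0; t < n; t++) {
            if (done[t]) continue;
            int ready = 1, ready_at = 0;
            for (int k = 0; k < tasks[t].npred; k++) {
                int p = tasks[t].pred[k];
                if (!done[p]) { ready = 0; break; }
                if (finish[p] > ready_at) ready_at = finish[p];
            }
            if (!ready) continue;
            for (int p = 0; p < nprocs; p++) {
                int s = ready_at > free_at[p] ? ready_at : free_at[p];
                if (best_t < 0 || s < best_start ||
                    (s == best_start && cp[t] > cp[best_t])) {
                    best_t = t; best_p = p; best_start = s;
                }
            }
        }
        assert(best_t >= 0); /* no ready task means the graph has a cycle */
        start[best_t] = best_start;
        proc[best_t] = best_p;
        finish[best_t] = best_start + tasks[best_t].cost;
        free_at[best_p] = finish[best_t];
        done[best_t] = 1;
        if (finish[best_t] > makespan) makespan = finish[best_t];
    }
    return makespan;
}
```

Calling `etf_cp_schedule` on a diamond-shaped graph (one entry task, two independent middle tasks, one exit task) with two processors places the two middle tasks on different processors, which is the behaviour a static macro-task scheduler is after.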
Fig. 6 Loop Description for Parallelization by OSCAR Compiler

    void process_slice(unsigned char rarr[22][16][16], ...);

    void mpeg2decode()
    {
        unsigned char arr[16][22][16][16];
        :
        for (i=0; i<16; i++) {
            process_slice(arr[i], ...);
        }
        :
    }

Section 5, continued (surviving terms): OSCAR, C, MPEG2, MP3, JPEG2000, DCT, Group of Picture (GOP).

Fig. 7 Parallelization MPEG2 encode
(Diagram labels: Process one picture; Process one macroblock; OSCAR Compiler; macro-tasks drawn as doall, loop and emt nodes.)

Fig. 8 Scheduling Result of MPEG2 encode on 4PE
(Diagram labels: PE0, PE1, PE2, PE3; macro-tasks MT1 and up; "using Locality"; "Processing macroblocks in parallel".)

MPEG2 decode (surviving terms): MPEG2, 11), MTG, OSCAR.
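The loop in Fig. 6 passes a different `arr[i]` to each call, so the 16 slice iterations do not share data and can run in parallel. The sketch below shows that property with a standard OpenMP `parallel for`, since the source cites OpenMP 9) in its API section. This is plain OpenMP used as a stand-in, not the OSCAR API itself, and `decode_picture`, the size macros and the body of `process_slice` are hypothetical placeholders (the real decoder work is elided in the source).

```c
#include <assert.h>
#include <string.h>

#define SLICES       16  /* slices per picture, as in Fig. 6 */
#define MB_PER_SLICE 22  /* macroblocks per slice */
#define MB_H         16
#define MB_W         16

/* Placeholder for process_slice() from Fig. 6: it only stamps the slice
   index into every byte so that the result can be checked. */
static void process_slice(unsigned char rarr[MB_PER_SLICE][MB_H][MB_W], int slice)
{
    assert(slice >= 0 && slice < SLICES);
    memset(rarr, slice, (size_t)MB_PER_SLICE * MB_H * MB_W);
}

/* Each iteration writes only arr[i], so iterations are independent.
   Without OpenMP support the pragma is ignored and the loop runs serially
   with the same result. */
void decode_picture(unsigned char arr[SLICES][MB_PER_SLICE][MB_H][MB_W])
{
    int i;
    #pragma omp parallel for
    for (i = 0; i < SLICES; i++)
        process_slice(arr[i], i);
}
```

Declaring the callee's parameter with explicit array dimensions, as Fig. 6 does, makes the per-iteration footprint visible in the source, which is the kind of information a parallelizing compiler needs to prove the iterations independent.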
Fig. 9 Parallelization MPEG2 decode
(Diagram labels: Process one picture; Process one slice; Inner Layer (Process one slice); Process one macroblock; OSCAR Compiler; macro-tasks drawn as loop, doall, sb, bb and emt nodes.)

Fig. 10 Scheduling Result of MPEG2 decode on 4PE
(Diagram labels: PE0, PE1, PE2, PE3; macro-tasks MT1 and up; "using Locality"; "Processing slices in parallel"; Inner Layer.)

MP3 encode (surviving terms): MP3, MTG, OSCAR, 4PE.

Fig. 11 Parallelization MP3 encode
(Diagram labels: Process some frames; Process one frame; OSCAR Compiler; macro-tasks drawn as doall, loop and emt nodes.)

Fig. 12 Scheduling Result of MP3 encode on 4PE
(Diagram labels: PE0, PE1, PE2, PE3; macro-tasks MT1 and up; "using Locality"; "Processing frames in parallel".)

JPEG 2000 encode (surviving terms): JPEG 2000, DC, DWT, EBCOT (Embedded Block Coding with Optimized Truncation), 64x64, MTG, OSCAR.
Fig. 13 Parallelization JPEG 2000 encode
(Diagram labels: Process one picture; Process some lines; Process one codeblock; OSCAR Compiler; macro-tasks drawn as doall, loop, sb and emt nodes.)

Fig. 14 Scheduling Result of JPEG 2000 encode on 4PE
(Diagram labels: PE0, PE1, PE2, PE3; macro-tasks MT1 and up; "Processing some lines and codeblocks in parallel".)

JPEG 2000 encode, continued (surviving terms): EBCOT, MTG, 4PE.

Evaluated programs (surviving terms): MPEG2: MediaBench 12); MP3: UZURA MP3 encoder 13); JPEG 2000: JJ ).

5.1 (surviving terms): C, FR1000, MPEG2, DCT, Intel ); MPEG2: 30, SIF 352x240; MP3: 32, PCM; JPEG: x300; JPEG 2000; OSCAR.

5.4 FR1000 (surviving terms): API, gcc -O3, WorkRAM, CSM, MP3, gcc -O0, OSCAR API, 4PE, 1PE, MPEG2, MP3, JPEG, RP1.

Fig. 15 Evaluation Result on FR1000 Multicore
(Vertical axis: Speedup vs 1PE. Groups: MPEG2enc, MPEG2dec, MP3enc, JPEG 2000enc, each with bars for 1PE, 2PE, 3PE and 4PE.)
RP1 (surviving terms): SMP, API, SH, C, OSCAR API, 4PE, 1PE, MPEG2, MP3, JPEG.

Fig. 16 Evaluation Result on RP1 Multicore
(Vertical axis: Speedup vs 1PE. Groups: MPEG2enc, MPEG2dec, MP3enc, JPEG 2000enc, each with bars for 1PE, 2PE, 3PE and 4PE.)

6. (surviving terms): API, OSCAR, FR1000, RP1, 4PE, 1PE, MPEG2, MP3, JPEG, NEDO.

References
1) Pham, D. et al.: The Design and Implementation of a First-Generation CELL Processor, Proceedings of the IEEE International Solid-State Circuits Conference (2005).
2) Cornish, J.: Balanced Energy Optimization, International Symposium on Low Power Electronics and Design (2004).
3) Suga, A. et al.: FR-V Single-Chip Multicore Processor: FR1000, Fujitsu Sci Tech J, Vol. 42, No. 2, pp (2006).
4) : UniPhier, DA (2005).
5) Kamei, T.: SH-X3: An Enhanced SuperH Core for Low-power MP Systems, Fall Microprocessor Forum 2006 (2006).
6) The Multicore ASSOCIATION: multicore-association.org/.
7) :,, Vol. 40, No. 5 (1999).
8) Kimura, K. et al.: Multigrain Parallel Processing on Compiler Cooperative Chip Multiprocessor, Proc. of 9th Workshop on Interaction between Compilers and Computer Architectures (INTERACT-9) (2005).
9) OpenMP Application Program Interface Version 2.5 (2005).
10) :,, Vol. 43, No. 4 (2002).
11) Iwata, E. et al.: Exploiting Coarse-Grain Parallelism in the MPEG-2 Algorithm, Technical Report CSL-TR (1998).
12) Lee, C. et al.: MediaBench: A Tool for Evaluating and Synthesizing Multimedia and Communications Systems, 30th International Symposium on Microarchitecture (MICRO-30) (1997).
13) UZURA3: MPEG1/LayerIII Encoder in FORTRAN90. kitaurawa/index e.html.
14) R Grosbois et al.:
15) Intel: A Fast Precise Implementation of 8x8 Discrete Cosine Transform Using the Streaming SIMD Extensions and MMX Instructions (1999). AP-922, Order Number
I J C T A, 9(13) 2016, pp. 6229-6238 International Science Press Reduction of Area and Power of Shift Register Using Pulsed Latches Md Asad Eqbal * & S. Yuvaraj ** ABSTRACT The timing element and clock
More informationYong Cao, Debprakash Patnaik, Sean Ponce, Jeremy Archuleta, Patrick Butler, Wu-chun Feng, and Naren Ramakrishnan
Yong Cao, Debprakash Patnaik, Sean Ponce, Jeremy Archuleta, Patrick Butler, Wu-chun Feng, and Naren Ramakrishnan Virginia Polytechnic Institute and State University Reverse-engineer the brain National
More informationScalable Foveated Visual Information Coding and Communications
Scalable Foveated Visual Information Coding and Communications Ligang Lu,1 Zhou Wang 2 and Alan C. Bovik 2 1 Multimedia Technologies, IBM T. J. Watson Research Center, Yorktown Heights, NY 10598, USA 2
More informationFeasibility Study of Stochastic Streaming with 4K UHD Video Traces
Feasibility Study of Stochastic Streaming with 4K UHD Video Traces Joongheon Kim and Eun-Seok Ryu Platform Engineering Group, Intel Corporation, Santa Clara, California, USA Department of Computer Engineering,
More informationDESIGN PHILOSOPHY We had a Dream...
DESIGN PHILOSOPHY We had a Dream... The from-ground-up new architecture is the result of multiple prototype generations over the last two years where the experience of digital and analog algorithms and
More informationAn Adaptive Technique for Reducing Leakage and Dynamic Power in Register Files and Reorder Buffers
An Adaptive Technique for Reducing Leakage and Dynamic Power in Register Files and Reorder Buffers Shadi T. Khasawneh and Kanad Ghose Department of Computer Science State University of New York, Binghamton,
More informationESE534: Computer Organization. Today. Image Processing. Retiming Demand. Preclass 2. Preclass 2. Retiming Demand. Day 21: April 14, 2014 Retiming
ESE534: Computer Organization Today Retiming Demand Folded Computation Day 21: April 14, 2014 Retiming Logical Pipelining Physical Pipelining Retiming Supply Technology Structures Hierarchy 1 2 Image Processing
More informationTools to Debug Dead Boards
Tools to Debug Dead Boards Hardware Prototype Bring-up Ryan Jones Senior Application Engineer Corelis 1 Boundary-Scan Without Boundaries click to start the show Webinar Outline What is a Dead Board? Prototype
More informationM598. Radeon E8860 (Adelaar) Video & Graphics PMC. Aitech
Single Width PMC PCI-X 64-bit @ 133 MHz Host Interface AMD Radeon E8860 (Adelaar) GPU 6 Independent Graphics Heads 2 GB GDDR5 Analog Inputs Analog and Digital Outputs Full Switching Capabilities Capture
More informationH.264/AVC Baseline Profile Decoder Complexity Analysis
704 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 13, NO. 7, JULY 2003 H.264/AVC Baseline Profile Decoder Complexity Analysis Michael Horowitz, Anthony Joch, Faouzi Kossentini, Senior
More informationA Fast Constant Coefficient Multiplier for the XC6200
A Fast Constant Coefficient Multiplier for the XC6200 Tom Kean, Bernie New and Bob Slous Xilinx Inc. Abstract. We discuss the design of a high performance constant coefficient multiplier on the Xilinx
More informationVideo compression principles. Color Space Conversion. Sub-sampling of Chrominance Information. Video: moving pictures and the terms frame and
Video compression principles Video: moving pictures and the terms frame and picture. one approach to compressing a video source is to apply the JPEG algorithm to each frame independently. This approach
More informationIMPLEMENTATION AND ANALYSIS OF FIR FILTER USING TMS 320C6713 DSK Sandeep Kumar
IMPLEMENTATION AND ANALYSIS OF FIR FILTER USING TMS 320C6713 DSK Sandeep Kumar Munish Verma ABSTRACT In most of the applications, analog signals are produced in response to some physical phenomenon or
More information1ms Column Parallel Vision System and It's Application of High Speed Target Tracking
Proceedings of the 2(X)0 IEEE International Conference on Robotics & Automation San Francisco, CA April 2000 1ms Column Parallel Vision System and It's Application of High Speed Target Tracking Y. Nakabo,
More informationMemory efficient Distributed architecture LUT Design using Unified Architecture
Research Article Memory efficient Distributed architecture LUT Design using Unified Architecture Authors: 1 S.M.L.V.K. Durga, 2 N.S. Govind. Address for Correspondence: 1 M.Tech II Year, ECE Dept., ASR
More informationLow Power Design: From Soup to Nuts. Tutorial Outline
Low Power Design: From Soup to Nuts Mary Jane Irwin and Vijay Narayanan Dept of CSE, Microsystems Design Lab Penn State University (www.cse.psu.edu/~mdl) ISCA Tutorial: Low Power Design Introduction.1
More information