Rev. 3.0

Key Design Features

- Synthesizable, technology independent IP Core for FPGA, ASIC and SoC
- Supplied as human readable VHDL (or Verilog) source code
- Versatile RGB (or YCbCr 444) video scaler capable of scaling up or down by any factor
- Supports all video resolutions up to […] x […] pixels
- Fully programmable scale parameters
- Fully programmable RGB channel widths allow support for any RGB format (or greyscale if only one channel is used)
- Features a 5x5-tap polyphase filter in the x and y dimensions with […] unique phases
- Fully programmable filter coefficients to suit the desired application
- Example general purpose 'Lanczos2' filter coefficients shipped with the design. Different coefficient sets available on request
- Output rate is 1 pixel per clock for scaling factors > 1
- Generates one scaled output frame for every input frame
- No frame buffer required
- Fully pipelined architecture with simple flow control
- Supports […] MHz+ operation on basic FPGA devices (1)

Applications

- Studio quality dynamic real-time video scaling
- Conversion of all standard and custom video resolutions such as HD720P to HD1080P, XGA to VGA etc.
- Support for the latest generation video formats with resolutions of 4K and above
- Scaling for flat panel displays, portable devices, video consoles, video format converters, set-top boxes, digital TV etc.
- Picture-in-Picture (PiP) applications

Block Diagram

Figure 1: scaler architecture
[Block diagram: pixin (dw*3 bits, RGB or YCbCr) -> pixel buffer -> horizontal scaler (5-tap polyphase filter with coefficient ROM) -> line buffer -> vertical scaler (5-tap polyphase filter with coefficient ROM) -> pixout (dw*3 bits, RGB or YCbCr). The scalers use a 24-bit accumulator with a 24-bit scale pitch.]

Generic Parameters

Generic name       Description                     Type      Valid range
dw                 RGB channel width               integer   […]
line_width         Width of linestores in pixels   integer   4 < pixels < […]
log2_line_width    Log2 of linestore width         integer   log2(line_width)

Pin-out Description

Pin name               I/O   Description                                                    Active state
clk                    in    Synchronous clock                                              rising edge
reset                  in    Asynchronous reset                                             low
scale_pitch_x [23:0]   in    1 / (x scale factor), specified as an unsigned 24-bit
                             fixed-point number with 12 fractional bits
scale_pitch_y [23:0]   in    1 / (y scale factor), specified as an unsigned 24-bit
                             fixed-point number with 12 fractional bits
input_ppl [15:0]       in    Number of pixels per line in the source input frame
                             (16-bit number)
input_lpf [15:0]       in    Number of lines per frame in the source input frame
                             (16-bit number)
output_ppl [15:0]      in    Number of pixels per line in the scaled output frame
                             (16-bit number)
output_lpf [15:0]      in    Number of lines per frame in the scaled output frame
                             (16-bit number)

(1) Xilinx 7-series used as a benchmark

Copyright 2017 www.zipcores.com

Pin-out Description cont...

Pin name               I/O   Description
pixin [dw*3 - 1:0]     in    RGB pixel in
pixin_vsync            in    Vertical sync in (coincident with first pixel of input frame)
pixin_hsync            in    Horizontal sync in (coincident with first pixel of input line)
pixin_val              in    Input pixel valid
pixin_rdy              out   Ready to accept input pixel (handshake signal)
pixout [dw*3 - 1:0]    out   RGB pixel out
pixout_vsync           out   Vertical sync out (coincident with first pixel of output frame)
pixout_hsync           out   Horizontal sync out (coincident with first pixel of output line)
pixout_val             out   Output pixel valid
pixout_rdy             in    Ready to accept output pixel (handshake signal)

General Description

The IP Core is a studio quality video scaler capable of generating interpolated output images from […] x […] up to […] x […] pixels in resolution. The architecture permits seamless scaling (either up or down) depending on the chosen scale factor. Internally, the scaler uses a 24-bit accumulator and a bank of polyphase FIR filters with […] phases or interpolation points. All filter coefficients are programmable, allowing the user to define a wide range of filter characteristics.

Pixels flow in and out of the video scaler in accordance with a simple valid-ready streaming protocol. Pixels are transferred into the scaler on a rising clock-edge when pixin_val and pixin_rdy are both active. Likewise, pixels are transferred out of the scaler on a rising clock-edge when pixout_val and pixout_rdy are both active. As such, the pipeline protocol allows both input and output interfaces to be stalled independently.

The scaler is partitioned into a horizontal scaling module in series with a vertical scaling module as shown by Figure 1.

Scale pitch, pixels per line and lines per frame

The output resolution of the scaled output image is controlled by the parameters scale_pitch_x, scale_pitch_y, input_ppl, input_lpf, output_ppl and output_lpf. The scale pitch may be calculated using the following formula:

    scale pitch = (Input resolution) / (Output resolution)

As an example, consider the scaling of VGA format video (640x480) to XGA format video (1024x768). In this case the scale pitch in the x and y dimensions would be 0.625. As the value must be specified as a 12.12-bit number, the actual scale pitch must be multiplied by 2^12, giving the value '2560'.

In addition, the user must also specify the exact resolution of the source input frame and the scaled output frame using the parameters: input_ppl, input_lpf, output_ppl and output_lpf. The following tables give a list of the parameters required for the conversion of some example video formats.

SCALE UP

IN               OUT                Pitch X  Pitch Y  I/P ppl  I/P lpf  O/P ppl  O/P lpf
VGA 640x480      800x600            3277     3277     640      480      800      600
800x600          XGA 1024x768       3200     3200     800      600      1024     768
XGA 1024x768     HD1080 1920x1080   2184     2913     1024     768      1920     1080
SXGA 1280x1024   2K 2048x1080       2560     3884     1280     1024     2048     1080

SCALE DOWN

IN                 OUT              Pitch X  Pitch Y  I/P ppl  I/P lpf  O/P ppl  O/P lpf
800x600            VGA 640x480      5120     5120     800      600      640      480
XGA 1024x768       800x600          5243     5243     1024     768      800      600
HD1080 1920x1080   XGA 1024x768     7680     5760     1920     1080     1024     768
2K 2048x1080       SXGA 1280x1024   6554     4320     2048     1080     1280     1024

Flow control

Pixels flow in and out of the video scaler in accordance with the valid-ready pipeline protocol (2). The scaling operation occurs on a line-by-line basis with the signal pixin_hsync specifying the start of a new line and pixin_vsync specifying the start of a new frame.

All pixels into the scaler (including pixin_vsync and pixin_hsync) must be qualified by the pixin_val signal asserted, otherwise changes to the input signals will be ignored. Note that the first pixel of a new frame is accompanied by a valid vsync and hsync. The first pixel of a new line is accompanied by hsync only.

On receipt of the first vsync, the scaling operation begins and output pixels are generated in accordance with the chosen scale parameters. Generally, for scale-down (decimation) operations, the input interface will not stall. Conversely, for scale-up (interpolation) the number of output pixels will be greater than the number of input pixels. This will result in the occasional stalling of the input due to the change in ratio.

(2) See Zipcores application note: app_note_zc.pdf for more examples of how to use the valid-ready pipeline/streaming protocol
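The valid-ready rule described above (a pixel moves only on a clock edge where valid and ready are both asserted) can be illustrated with a small cycle-based model. This is a behavioural sketch in Python rather than the core's VHDL; the function name `transfer` and the list-based signal patterns are hypothetical and only serve to show how a de-asserted ready stalls the source.

```python
from typing import List, Sequence, Tuple


def transfer(pixels: Sequence[int],
             valid: Sequence[int],
             ready: Sequence[int]) -> List[Tuple[int, int]]:
    """Model a valid-ready interface one clock cycle at a time.

    pixels : values the source wants to send, in order
    valid  : per-cycle state of the valid signal (source side)
    ready  : per-cycle state of the ready signal (sink side)

    Returns a list of (cycle, pixel) pairs, one per accepted transfer.
    The source holds the current pixel until it has been accepted.
    """
    accepted = []
    index = 0
    for cycle, (v, r) in enumerate(zip(valid, ready)):
        if index >= len(pixels):
            break  # nothing left to send
        if v and r:
            # Transfer happens only when valid AND ready are both high.
            accepted.append((cycle, pixels[index]))
            index += 1
        # Otherwise the pipeline is stalled and the pixel is held.
    return accepted


# Sink de-asserts ready for two cycles: pixel 12 is held until cycle 4.
stalled = transfer([10, 11, 12, 13, 14],
                   valid=[1, 1, 1, 1, 1, 1, 1],
                   ready=[1, 1, 0, 0, 1, 1, 1])
```

In this run the third pixel is accepted at cycle 4 rather than cycle 2, which mirrors the way a scale-up operation occasionally stalls the input interface.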

Loading of scale parameters

The scale parameters are fully programmable and allow the input video to be scaled differently on a frame-by-frame basis. With careful design, the architecture also permits different video sources to be multiplexed into the same scaler with different scaling parameters.

Parameters are updated continuously on every rising clock edge and must remain stable during the scaling operation. When programming new scale parameters (e.g. due to a change of video mode) it is necessary to assert the system reset signal for at least one clock cycle to avoid any possible corruption in the output video. This is often convenient to do during the vertical blanking period of an input video frame when there are no active pixels. After reset the scaler will lock to the next clean input frame before the scaling operation continues.

Scaling algorithm

The scaler uses a 5-tap polyphase filter with […] phases in both the x and y dimensions. By default, both the x and y filter kernels use a coefficient set sampled from the Lanczos2 function (Figure 2).

Figure 2: Lanczos2 windowed-sinc function - filter tap positioning

Figure 3 below shows how the phase changes relative to the pixel taps during the scaling operation. Depending on the fractional part of the accumulator, different weights are given to the pixel taps when generating the interpolated output pixels.

Figure 3: The […] phases of the 5-tap filter
[Diagram: position of the filter function relative to the pixel taps for successive phases.]

Different filter kernels can generate slightly different results. Example scripts are provided to generate: Lanczos2, Lanczos3, Hamming and Kaiser coefficient sets. Alternatively, the user may choose to generate their own coefficient sets (3).

Functional Timing

Figure 4 shows the signalling at the input to the scaler at the start of a new frame. The first line of a new frame begins with pixin_vsync and pixin_hsync asserted together with the first pixel. Note that the signals pixin, pixin_vsync and pixin_hsync are only valid if pixin_val is also asserted.

In addition, the diagram shows what happens when pixin_rdy is de-asserted. In this case, the pipeline is stalled and the upstream interface must hold off before further pixels are processed.

Figure 4: First line of a new input frame - also showing pipeline stall
[Timing diagram: pixin, pixin_vsync, pixin_hsync, pixin_val and pixin_rdy at the start of a new frame, with a pipeline stall while pixin_rdy is low.]

Figure 5 shows the signalling at the output of the scaler. The output uses exactly the same protocol as the input. Each new output line begins with pixout_hsync and pixout_val asserted. In this particular example, it shows pixout_val de-asserted for one clock cycle, in which case the output pixel should be ignored. Remember that transfers at a valid-ready interface are only permitted when valid and ready are both asserted simultaneously.

Figure 5: First line of a new output frame - also showing invalid output pixel
[Timing diagram: pixout and the output sync and handshake signals at the start of a new output frame, with one invalid pixel to be ignored.]

(3) See Zipcores application note: app_note_zc3.pdf for examples of how to generate different coefficient sets
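As a rough illustration of the scaling algorithm described above, the sketch below builds a polyphase coefficient table from the Lanczos2 kernel and steps a fixed-point accumulator to pick a phase for each output pixel. It is not the supplied coefficient script: the function names, the choice of 16 phases, the 12 fractional bits and the unity-gain normalisation are all assumptions made for the example.

```python
import math


def lanczos(x: float, a: int = 2) -> float:
    """Lanczos windowed-sinc kernel with support (-a, a)."""
    if x == 0.0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def polyphase_table(num_taps: int = 5, num_phases: int = 16, a: int = 2):
    """Return num_phases rows of num_taps weights, each row summing to 1.

    num_phases=16 is an assumed value, not taken from the datasheet.
    """
    centre = num_taps // 2
    table = []
    for phase in range(num_phases):
        frac = phase / num_phases  # sub-pixel offset of the output sample
        weights = [lanczos((tap - centre) - frac, a) for tap in range(num_taps)]
        total = sum(weights)
        table.append([w / total for w in weights])  # normalise to unity gain
    return table


def phase_sequence(pitch: int, count: int, frac_bits: int = 12,
                   num_phases: int = 16):
    """Step an accumulator by `pitch` and return (input index, phase) pairs.

    The integer part of the accumulator selects the input pixel and the
    fractional part selects the filter phase. frac_bits=12 is an assumption.
    """
    mask = (1 << frac_bits) - 1
    acc = 0
    steps = []
    for _ in range(count):
        index = acc >> frac_bits
        phase = ((acc & mask) * num_phases) >> frac_bits
        steps.append((index, phase))
        acc += pitch
    return steps


table = polyphase_table()
steps = phase_sequence(pitch=2560, count=4)  # pitch of 0.625 (scale-up)
```

With a pitch of 0.625 the accumulator advances by less than one input pixel per output pixel, so consecutive outputs reuse the same input position with a different phase, which is the interpolation case.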

Source File Description

All source files are provided as text files coded in VHDL. The following table gives a brief description of each file.

Source file              Description
video_in.txt             Text-based source video file
video_file_reader.vhd    Reads text-based source video file
pipeline_reg.vhd         Pipelined register element
pipeline_shovel.vhd      Pipelined 'shovel' register
ram_dp_w_r.vhd           Dual port RAM component
fifo_sync.vhd            Synchronous FIFO
x_buffer.vhd             Pixel input buffer/shift register
x_filter_pack.vhd        Package containing x-filter coefficients
x_filter_polyphase.vhd   Horizontal scaler output pixel filter
x_scaler.vhd             Horizontal scaler component
y_buffer.vhd             Line buffer
y_filter_pack.vhd        Package containing y-filter coefficients
y_filter_polyphase.vhd   Vertical scaler output pixel filter
y_scaler.vhd             Vertical scaler component
xy_reg.vhd               Scaler input registers
[…]                      Scaler top-level component
xy_scaler_bench.vhd      Top-level test bench

Functional Testing

An example VHDL testbench is provided for use in a suitable VHDL simulator. The compilation order of the source code is as follows:

1. video_file_reader.vhd
2. pipeline_reg.vhd
3. pipeline_shovel.vhd
4. ram_dp_w_r.vhd
5. fifo_sync.vhd
6. x_buffer.vhd
7. x_filter_pack.vhd
8. x_filter_polyphase.vhd
9. x_scaler.vhd
10. y_buffer.vhd
11. y_filter_pack.vhd
12. y_filter_polyphase.vhd
13. y_scaler.vhd
14. xy_reg.vhd
15. […] (scaler top-level component)
16. xy_scaler_bench.vhd

The VHDL testbench instantiates the scaler component and the user may modify the generic parameters in order to generate the desired scaled output image.

The source video for the simulation is generated by the video file-reader component. This component reads a text-based file which contains the RGB pixel data. The text file is called video_in.txt and should be placed in the top-level simulation directory.

The file video_in.txt follows a simple format which defines the state of the signals pixin_val, pixin_vsync, pixin_hsync and pixin on a clock-by-clock basis. An example file might be the following:

    […]    # pixel 0 line 0 (start of frame)
    […]    # pixel 1
    […]    # don't care!
    […]    # pixel 2
    […]    # pixel 0 line 1
    etc.

In this example, the first line of the video_in.txt file asserts the input signals pixin_val = 1, pixin_vsync = 1, pixin_hsync = 1 and pixin = 0x[…].

The simulation must be run for at least […] ms, during which time an output text file called video_out.txt will be generated. This file contains a sequential list of 24-bit output pixels in the same format as video_in.txt.

The example provided scales a 768x576 source test pattern by a factor of 0.833 in the x and y dimensions to give a VGA output image of 640x480 pixels. Figure 6 shows the resulting image from the test.

Figure 6: Output frame from the hardware simulation example (scale-down of 768x576 to 640x480)

Performance

The Digital Scaler was tested with a large number of scale factors to verify correct operation and to observe the quality of the output video. The true definition and quality is difficult to show within the limitations of this document; however, example images can be provided on request.

The video scaler was also verified using the Zipcores ZIP-HDV-[…] development board featuring a Xilinx Spartan FPGA. The photo in Figure 7 demonstrates the scale down of a PAL source image to a small custom video window of […] x […] pixels on an SXGA (1280x1024) background.
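As a sketch of how the scale parameters for a test such as the 768x576 to 640x480 example might be derived: the pitch is the input size divided by the output size, expressed as an unsigned fixed-point integer. The 12 fractional bits are an assumption inferred from the VGA-to-XGA example (0.625 x 4096 = 2560); the helper names and the round-to-nearest choice are illustrative, and since the datasheet does not state a rounding mode a result may differ from a tabulated value by one LSB.

```python
FRAC_BITS = 12  # assumed number of fractional bits in the scale pitch


def scale_pitch(input_size: int, output_size: int,
                frac_bits: int = FRAC_BITS) -> int:
    """Return input_size / output_size as an unsigned fixed-point integer.

    Uses integer arithmetic with round-to-nearest (an assumed rounding mode).
    """
    if input_size <= 0 or output_size <= 0:
        raise ValueError("sizes must be positive")
    return ((input_size << frac_bits) + output_size // 2) // output_size


def scaler_parameters(in_ppl: int, in_lpf: int, out_ppl: int, out_lpf: int):
    """Collect the six scale parameters for one format conversion."""
    return {
        "scale_pitch_x": scale_pitch(in_ppl, out_ppl),
        "scale_pitch_y": scale_pitch(in_lpf, out_lpf),
        "input_ppl": in_ppl,
        "input_lpf": in_lpf,
        "output_ppl": out_ppl,
        "output_lpf": out_lpf,
    }


# Scale-down used by the testbench example: 768x576 -> 640x480 (pitch 1.2).
bench = scaler_parameters(768, 576, 640, 480)

# Scale-up example from the text: VGA 640x480 -> XGA 1024x768 (pitch 0.625).
vga_to_xga = scaler_parameters(640, 480, 1024, 768)
```

A pitch above 4096 (1.0) corresponds to a scale-down and a pitch below it to a scale-up, so the two dictionaries above cover one case of each.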

Figure 7: Scaler demo lab setup (generation of a small […] x […] video window on an SXGA background)

Synthesis

The files required for synthesis and the design hierarchy are shown below:

- xy_reg.vhd
- pipeline_reg.vhd
- x_scaler.vhd
- pipeline_shovel.vhd
- x_buffer.vhd
- x_filter_polyphase.vhd
- pipeline_reg.vhd
- y_scaler.vhd
- pipeline_shovel.vhd
- y_buffer.vhd
- ram_dp_w_r.vhd
- fifo_sync.vhd
- pipeline_reg.vhd
- y_filter_polyphase.vhd
- pipeline_reg.vhd

The VHDL core is designed to be technology independent. However, as a benchmark, synthesis results have been provided for the Xilinx 7-series FPGAs. Synthesis results for other FPGAs and technologies can be provided on request.

Fixing the scale parameters at the scaler input will result in the most optimum scaler design. In addition, the speed of the design may be improved by tying the signal […] low. This may be possible if the designer knows that the pipeline downstream of the scaler will always be able to accept output pixels.

Careful attention must be paid to the width of the line stores as this will affect the amount of RAM resource used in the design. For single channel (greyscale) operation the user may use only one of the RGB channels and tie the other channel inputs to zero. This will result in further resource savings, with the other channels optimized away during synthesis.

Trial synthesis results are shown with the generic parameters set to: dw = 8, line_width = […] and log2_line_width = […]. Resource usage is specified after place and route of the design.

XILINX 7-SERIES FPGAS

Resource type          Artix-7    Kintex-7   Virtex-7
Slice Register         […]        […]        […]
Slice LUTs             […]        […]        […]
Block RAM              […]        […]        […]
DSP48                  […]        […]        […]
Occupied Slices        […]        […]        […]
Clock freq. (approx)   […] MHz    […] MHz    […] MHz

Revision History

Revision   Change description                                                     Date
1.0        Initial revision                                                       […]
1.1        Minor changes to the video_in.txt and video_out.txt file formats       […]
1.2        Moved scale parameters from generics to ports                          […]
1.3        Added extra items to key features                                      […]
1.4        Updated synthesis results                                              […]
1.6        Added scaling formula. Updated source file descriptions to include     […]
           shovels. Updated synthesis results
1.7        Improved block diagram and pinout descriptions                         […]
1.8        Updated synthesis results in line with source code changes             […]
2.0        Major revision. Simplified loading of scale parameters. Modified       […]
           architecture to support one frame out for one frame in
2.1        Moved to 16-bit scale parameters to support resolutions up to          […]
           […] x […]
2.2        Changed y-scaler to use full 5-line (5-tap) filter. Updated            […]
           synthesis results for Xilinx 7-series
3.0        Major revision. Modified design to support any RGB channel width,      […]
           e.g. RGB 8:8:8 etc.