Digilent Nexys-3 Cellular RAM Controller Reference Design Overview

General Overview

This document describes a reference design of a Cellular RAM (PSRAM, Pseudo Static RAM) controller for the Digilent Nexys-3 development board with a Spartan-6 FPGA. The reference design is a video pipeline controlled by a soft-core processor. The video part of the design displays an 800x600 pixel image stored in the PSRAM. The processor has full access to the PSRAM, as well as to the PS/2 keyboard interface and the 7-segment LED indicators.

The aim of this design is to demonstrate the use of the PSRAM on a specific development board. It is not intended to be educational, nor does it claim to be efficient or fault-free.

Design features:

- VESA 800x600 @ 60Hz video standard, 8-bit colour depth
- Synchronous single and burst PSRAM access with fixed latency @ 50MHz
- Dual-port frontend for the PSRAM memory controller
- PicoBlaze soft-core processor
- PS/2 keyboard interface with software access
- 7-segment LED driver with software access
- Verilog HDL (PS/2 interface is in VHDL)

Functional overview

This design consists of three distinct parts: the PSRAM memory controller (with a separate dual-port frontend), the video pipeline (with VGA timing generator) and the PicoBlaze soft-core processor (with peripherals).

The memory controller provides the application with a simple asynchronous memory access interface by taking care of all timing and signalling considerations. The attached dual-port frontend allows two separate applications to use the memory controller without considering each other (read further to see how this can negatively affect the result).

The video pipeline reads portions of the image from the PSRAM and buffers them in local block RAM (BRAM). The VGA timing generator reads the pixel data from this buffer and sends it to the VGA screen. The true dual-port nature of the BRAM allows these two components to work simultaneously.
The processor can access (read and write) any memory location, hence it can read and modify the image stored in the PSRAM. The processor also has access to two peripheral devices: the PS/2 keyboard interface and the 7-segment LED indicators.

Detailed overview

Video pipeline

The video standard used in this design is VESA 800x600 @ 60Hz. This means the image size is 800 by 600 pixels, and the image on the screen is refreshed 60 times per second. It provides good enough resolution and quality for many applications. This standard requires a pixel clock frequency of 40MHz, which is a very convenient number because the on-chip digital clock manager (DCM) can produce this frequency exactly from the 100MHz reference clock provided on board.

The external PSRAM is of a pseudo-static type, capable of operating asynchronously. Its access interface in this mode is very simple; however, the access time is at least 70ns, which is equivalent to just over 14MHz. Even though two pixels (assuming 8-bit colour depth) can be stored in each memory word, which gives at most about 28.6 million pixels per second, this is far from sufficient to match the 40MHz pixel clock requirement of the VGA timing generator. The memory is therefore used in synchronous mode, in which it can operate at a frequency of up to 80MHz.

In synchronous mode the memory can be accessed (read or written) in either single or burst mode, where single access is a variation of burst access with a burst length of one. Burst mode allows up to 128 memory cells to be accessed sequentially, one cell per clock cycle, without having to reinitiate the operation. The first address is provided with the read or write request; it is then incremented automatically by the memory.

Although all PSRAM memory cells are addressed linearly, the memory is arranged in rows; each row contains 128 cells. A burst access can only last until the end of a row, irrespective of where it started. If the application continues a burst access past the end of a row, the operation will abort at the end of the row and a new operation will have to be initiated to continue. The PSRAM controller provided in this design does not check for the end-of-row condition. It is up to the application to make sure that burst accesses do not cross the end of a row. Failure to do so will result in the controller entering an undefined state, from which it may not recover unless restarted. Each memory word is 16 bits wide.
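
Since the controller does not check for the end-of-row condition, the application has to split long accesses itself. The following is a minimal Python sketch of that bookkeeping, not code from the reference design (which is written in Verilog). The function name `split_burst` is hypothetical, and the sketch assumes that rows are aligned so that every address that is a multiple of 128 starts a new row.

```python
ROW_SIZE = 128  # cells per PSRAM row, as stated in the text


def split_burst(start_addr, length, row_size=ROW_SIZE):
    """Split a linear access into bursts that never cross a row boundary.

    Returns a list of (start_address, burst_length) tuples.
    ASSUMPTION: a new row starts at every address that is a multiple
    of row_size.
    """
    if start_addr < 0 or length < 0:
        raise ValueError("start_addr and length must be non-negative")
    bursts = []
    addr = start_addr
    remaining = length
    while remaining > 0:
        # Number of cells from addr up to and including the end of its row.
        room = row_size - (addr % row_size)
        count = min(room, remaining)
        bursts.append((addr, count))
        addr += count
        remaining -= count
    return bursts


if __name__ == "__main__":
    # One 800-pixel video line occupies 400 words; the second line
    # of the image starts at word address 400.
    for start, count in split_burst(400, 400):
        print(start, count)
```

For the second video line (400 words starting at address 400) the sketch produces four bursts of 112, 128, 128 and 32 cells, because 400 is not a multiple of the row length.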
The Nexys-3 board has only 8 colour signals (3 red, 3 green and 2 blue), so it is reasonable to store two pixel colour values in every memory cell. This arrangement also speeds up reads and writes when operating on 8-bit data. However, computing pixel addresses becomes slightly more complex. In addition, writing a single 8-bit datum is no longer straightforward. There are two ways to do this (in practice you would choose just one of them):

- first read a memory cell, update the corresponding byte and write it back
- use the additional LB (lower byte enable) and UB (upper byte enable) memory signals to indicate which byte you wish to write

Even with burst access it is not possible to read the memory continuously for an indefinite period. We must, however, provide an uninterrupted stream of pixel data to the VGA generator. One way to do so is to read the memory, on average, at a faster rate than the VGA generator reads pixel colour values. In this case a buffer is required between the memory and the VGA generator. This scheme is implemented in this design.

The memory frontend of the video pipeline reads the memory in burst mode at a frequency of 50MHz, two pixels at a time, and stores them in on-chip block RAM. The VGA frontend of the video pipeline uses the second port of the BRAM buffer to read pixel colour values at a frequency of 40MHz, two at a time, and sends the corresponding byte to the screen.

For this arrangement to run properly, the whole buffer is filled with pixel data at system start-up. Then, every time the VGA frontend reads the middle or the last word from the buffer, the memory frontend initiates a burst read to refill the half of the buffer that has just been read with fresh pixel colour values. The address counters of the two frontends are synchronized, and because the memory frontend operates at a higher frequency than the VGA frontend (50MHz vs. 40MHz) and fetches two pixel colour values at a time, this arrangement functions correctly and leaves certain time slots for the second application (the processor) to access the memory.

As mentioned previously, the image size is 800 x 600 pixels with an 8-bit colour depth. Each memory cell contains colour data for two pixels. The video pipeline therefore reads 800 x 600 / 2 = 240000 memory cells, starting from address 0. Make sure you set proper offsets when preloading the memory.

Memory controller

The dual-port frontend for the memory controller provided in this design has a very simple (and limited) principle of operation. If two applications (the processor and the video pipeline) request memory access simultaneously, it gives priority to the video pipeline. No memory access is granted to any application until the current operation (if there is one) has finished. The user must make sure that the second application does not monopolize the memory with long consecutive burst operations, otherwise the video pipeline will be starved of pixel colour data and the picture on the screen will degrade.

The memory controller provides a simple asynchronous memory access interface to the application. The application must assert a read or write request signal together with the address and wait for the op_begun signal, after which the application may deassert the request signals and the address. The operation is finished when the op_done signal is asserted. In the case of a read, data is available on the next clock cycle after data_ok is asserted. In the case of a write, data must be provided on the next clock cycle after data_ok is asserted. In the case of a burst access, the burst signal must be asserted after data_ok goes high and kept high until the end of the burst. Data must then be read in or sent out on every subsequent clock cycle.
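
The priority rule of the dual-port frontend described in this section can be illustrated with a small behavioural model. This is a Python sketch under stated assumptions, not the Verilog of the reference design: the names `arbitrate` and `simulate` are hypothetical, and each operation is modelled simply as a request cycle plus a fixed length in clock cycles.

```python
def arbitrate(busy, video_req, cpu_req):
    """Decide which requester is granted the memory in this cycle.

    Returns "video", "cpu" or None. Nothing is granted while an
    operation is in progress; otherwise the video pipeline wins.
    """
    if busy:
        return None
    if video_req:
        return "video"
    if cpu_req:
        return "cpu"
    return None


def simulate(video_ops, cpu_ops):
    """Serialize operations from both requesters.

    Each argument is a list of (request_cycle, length_in_cycles).
    Returns a list of (source, start_cycle, end_cycle) in grant order.
    """
    video = sorted(video_ops)
    cpu = sorted(cpu_ops)
    log = []
    cycle = 0
    while video or cpu:
        video_req = bool(video) and video[0][0] <= cycle
        cpu_req = bool(cpu) and cpu[0][0] <= cycle
        # The loop only arbitrates between operations, so busy is False.
        winner = arbitrate(False, video_req, cpu_req)
        if winner is None:
            cycle += 1
            continue
        queue = video if winner == "video" else cpu
        _, length = queue.pop(0)
        log.append((winner, cycle, cycle + length))
        cycle += length  # the memory is busy until the operation ends
    return log


if __name__ == "__main__":
    # A 10-cycle processor operation delays the video request made at cycle 5.
    for entry in simulate(video_ops=[(0, 4), (5, 4)], cpu_ops=[(0, 10)]):
        print(entry)
```

In this example the second video request, issued at cycle 5, is not served until cycle 14 because the processor operation that started at cycle 4 must finish first. This is the starvation effect that long processor bursts can cause.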

Processor application

From the software point of view, the processor has access to any memory location. This cannot be done with a single instruction, however. The program must provide a 23-bit address, 16 bits of data and 4 control signals (read strobe, write strobe, upper byte enable, lower byte enable) to the memory in order to access it. This is done by writing to certain registers which are associated with the output ports of the processor. The mapping is as follows:

Port ID   Register (WR)          Register (RD)
0         data [7:0]             data [7:0]
1         data [15:8]            data [15:8]
2         address [7:0]          -
3         address [15:8]         -
4         address [22:16]        -
5         control signals        -
6         operation done         operation done
7         7-segment LEDs 1,2     scan code
8         7-segment LEDs 3,4     -

As soon as a read or write strobe bit appears in the control register, the FSM initiates the corresponding operation. These control bits are then cleared automatically. When the operation is complete, the FSM writes 0x01 to the op_done register. This register must be cleared by the program.

It may therefore take quite some time to access just a single memory cell: first, the processor registers need to be loaded with the corresponding values (6 in the worst case), then they need to be sent out (again, all 6 in the worst case). In the case of a read operation one must also monitor the op_done register in a loop, and then read the data in from two external registers. Memory access latency must also be added. Therefore, from the software point of view, a read operation takes about 40 clock cycles to complete.

In a similar fashion, i.e. through registers mapped to processor ports, software can control the four 7-segment LED indicators. They are configured to support only hexadecimal digits.

The PS/2 interface controller constantly monitors keyboard activity, registers the message, extracts the information byte and generates an interrupt to the processor. When the processor receives the interrupt, the program can retrieve the code from port 7.
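
Returning to the memory access ports in the mapping above, the sequence of port writes needed for one PSRAM write can be sketched as follows. This is a Python illustration only, not PicoBlaze code from the design. The port IDs come from the mapping; everything else is an assumption made for the example: the bit positions inside the control register, the active-high byte enables, the rule that the even pixel of a pair sits in the lower byte, and the function names.

```python
# Port IDs taken from the port mapping in the text.
PORT_DATA_LO, PORT_DATA_HI = 0, 1
PORT_ADDR_LO, PORT_ADDR_MID, PORT_ADDR_HI = 2, 3, 4
PORT_CONTROL = 5

# ASSUMPTION: hypothetical bit positions; the source does not give them.
CTRL_READ, CTRL_WRITE, CTRL_UB, CTRL_LB = 0x01, 0x02, 0x04, 0x08

IMAGE_WIDTH = 800  # pixels per line, two pixels per 16-bit word


def address_writes(addr):
    """Return the three port writes that load a 23-bit address."""
    if not 0 <= addr < (1 << 23):
        raise ValueError("address must fit in 23 bits")
    return [
        (PORT_ADDR_LO, addr & 0xFF),
        (PORT_ADDR_MID, (addr >> 8) & 0xFF),
        (PORT_ADDR_HI, (addr >> 16) & 0x7F),
    ]


def word_write_sequence(addr, data):
    """Return the six port writes for a full 16-bit word write."""
    if not 0 <= data <= 0xFFFF:
        raise ValueError("data must fit in 16 bits")
    writes = [(PORT_DATA_LO, data & 0xFF), (PORT_DATA_HI, data >> 8)]
    writes += address_writes(addr)
    # The strobe goes last so that data and address are already in place.
    writes.append((PORT_CONTROL, CTRL_WRITE | CTRL_UB | CTRL_LB))
    return writes


def pixel_write_sequence(x, y, colour):
    """Return the port writes that update one 8-bit pixel via a byte enable.

    ASSUMPTION: the pixel with the even index is in the lower byte.
    """
    index = y * IMAGE_WIDTH + x
    upper = index % 2 == 1
    data = (colour & 0xFF) << 8 if upper else colour & 0xFF
    writes = [(PORT_DATA_LO, data & 0xFF), (PORT_DATA_HI, data >> 8)]
    writes += address_writes(index // 2)
    writes.append((PORT_CONTROL, CTRL_WRITE | (CTRL_UB if upper else CTRL_LB)))
    return writes


if __name__ == "__main__":
    for port, value in pixel_write_sequence(1, 0, 0xAB):
        print(port, hex(value))
```

Each returned pair corresponds to one output instruction of the processor. After the last write the program would still have to poll port 6 until it reads 0x01 and then clear it, as described above; that part is not modelled here.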
When a key is pressed, the keyboard transmits its scan code. When the key is released, the keyboard transmits 0xF0 followed by the key's code. For certain keys, called extended keys, 0xE0 precedes the key codes. This means that the processor will receive 3 or 5 interrupts for each key press and release. The situation changes if a key is held down or if a key is used in combination with another key, e.g. shift + a.
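
The byte sequences described above can be turned into key events with a small state machine. The following Python sketch is an illustration only; in the design this work would be done by the interrupt routine of the processor. The function name `decode_scan_codes` is hypothetical, and the byte values 0x1C and 0x75 in the usage lines are just example key codes.

```python
BREAK_PREFIX = 0xF0     # sent before the code of a released key
EXTENDED_PREFIX = 0xE0  # sent before the codes of extended keys


def decode_scan_codes(byte_stream):
    """Return a list of (code, extended, released) key events.

    Each byte in byte_stream corresponds to one interrupt, i.e. one
    byte read from port 7.
    """
    events = []
    extended = False
    released = False
    for byte in byte_stream:
        if byte == EXTENDED_PREFIX:
            extended = True
        elif byte == BREAK_PREFIX:
            released = True
        else:
            # A key code completes the event; reset the prefix flags.
            events.append((byte, extended, released))
            extended = False
            released = False
    return events


if __name__ == "__main__":
    # Ordinary key: 3 bytes for press and release.
    print(decode_scan_codes([0x1C, 0xF0, 0x1C]))
    # Extended key: 5 bytes for press and release.
    print(decode_scan_codes([0xE0, 0x75, 0xE0, 0xF0, 0x75]))
```

A key that is held down makes the keyboard repeat its code, which this sketch would report as repeated press events; handling of key combinations is left out.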

Known issues

1. There appears to be a 1-cycle data delay after the VGA generator is enabled. This is possibly due to the 16-bit -> 8-bit buffer register.
2. Sometimes certain pixels on the screen seem to blink or change colour. The cause is unknown.

Possible improvements

1. Improve dual-port frontend priority management.
2. Add an end-of-row check to the PSRAM controller.

Comments on implementation

1. The design was implemented using Xilinx ISE 12.3.
2. In the project file hierarchy you may see that module BSCAN_BLOCK_inst is missing. You may safely continue without it.
3. Synthesis will warn that Input <instruction<0:11>> is never used. This appears to be a bug, but not in the design. You may safely ignore this warning.
4. Synthesis will warn that Node <input_buffer_0> is unconnected. This is a small coding issue. You may safely ignore this warning.
5. Implementation will warn that read_strobe_flop has an unconnected output pin. This is because this pin is not used in this design. You may safely ignore this warning.
6. Implementation will warn that k_write_strobe_flop has an unconnected output pin. This is because this pin is not used in this design. You may safely ignore this warning.
7. Implementation will warn that interrupt_ack_flop has an unconnected output pin. This is because this pin is not used in this design. You may safely ignore this warning.

Feedback

Please send your feedback/bug reports to vadim.pesonen@ati.ttu.ee