EDA385 Bomberman. Fredrik Ahlberg Adam Johansson Magnus Hultin


EDA385 Bomberman Fredrik Ahlberg ael09fah@student.lu.se Adam Johansson rys08ajo@student.lu.se Magnus Hultin ael08mhu@student.lu.se 2013-09-23

Abstract

This report describes how a Super Nintendo Entertainment System-like system running the classic game Bomberman was developed and implemented on a Nexys 3 development board. Custom IPs were written in VHDL, and the software, running on a Xilinx MicroBlaze soft processor, was written in C. Four original Nintendo gamepads were used for player input, and a speaker was added for playback of music and sound effects. The project met all of its original goals. Instructions on how to connect the gamepads and a speaker are included in this report along with timing diagrams. The project took seven weeks and was done as a part of the course Design of Embedded Systems - Advanced Course taken at LTH in Lund, Sweden.

1 Introduction

In this chapter the concept behind the project and its goals are explained, along with a few design ideas. An overview of the system architecture is also presented. The following chapters give detailed descriptions of the custom IPs and program code, along with installation instructions for anyone wanting to test the game. Lastly, there is a discussion of problems that occurred, possible improvements to the design and thoughts about how the project turned out.

2 Concept

In 1995 the game Super Bomberman 2 was released in Europe by Hudson Soft for the Super Nintendo Entertainment System (SNES). In it, the player controlled a small bomberman that navigated a maze and could place bombs that blasted both obstructions and enemies out of the way. It contained both a single player storyline and a multiplayer component. The multiplayer component was the only focus of this project. In it, up to four players battled it out in a maze-like arena, collecting powerups and bombs until only one player remained. Figures 1 and 2 show a screenshot of the original game and a picture of the FPGA running the finished project.

Figure 1: Screenshot from Super Bomberman 2

Figure 2: The final system up and running.

3 Architecture

The original goal for the project was to implement a working Bomberman multiplayer game for up to four players without powerups or any sound. Original Nintendo gamepads were to be used, as well as original graphic elements, and the game was to be displayed in 640x480 resolution @ 60 Hz using a VGA interface. The system was to be designed to be as software independent as possible, meaning that in theory any game could be played by running a different program. The final system achieved all this, as well as support for music and sound effects, and the software was extended to include three different powerups, improving the overall feel of the game. The graphics components, music and sound effects ended up being stored on the onboard flash memory since they would be too large to fit in program memory.

Figure 3 shows a block schematic of the finished hardware. All communication with the peripherals is done over the AXI bus. The custom IPs are implemented with the AXI Memory bus interface, so that they can easily be connected to any system using the AXI BRAM Controller IP, and memory mapped by the MicroBlaze. The GPU outputs the VGA RGB signals as well as an interrupt signal (IRQ). The IRQ is connected to the MicroBlaze using an AXI Interrupt Controller, meaning that more interrupts could easily be added and handled by the system if wanted (a keyboard interrupt, for example). The serial flash memory is accessed using the AXI Quad SPI Interface core. This core is configured to run in a read-only mode, reducing the complexity of the data transfer and eliminating the need to read or write any registers to set up a transfer. The flash memory is pre-loaded with the game components using Digilent Adept. The VGA timing generator was made available in a previous course and simply generates a horizontal and a vertical counter used by the GPU, as well as the VGA synchronisation signals for the monitor.

Figure 3: Overview of the hardware architecture (MicroBlaze CPU with BRAMs for programs and data on the LMB bus; SPI flash, gamepad interface, GPU and sound generator on the AXI bus; VGA timing generator feeding the GPU; legend distinguishes existing IP, custom IP and existing hardware)

4 Hardware

4.1 GPU

The GPU's task was to render the graphical elements in the game, meaning it had to decide the color of each pixel at the exact time it was needed by the screen. It was designed to be simple to program, requiring as little information from the user as possible, but still versatile and capable of displaying a lot of graphics elements, keeping in mind that it should be able to render any SNES-like game, not only Bomberman. It also had to meet the timing constraints of the system. Figure 5 shows a slightly simplified version of the entire GPU.

Two components make up the graphics: tiles, which are used for the background, and sprites, which are used for foreground objects. Both tiles and sprites are 16x16 pixels large. Tiles, however, are aligned to a grid and cannot be moved independently, whereas sprites can be offset by any value from this grid.

Sprites and tiles are made up of a bitmap and a palette index. The bitmaps consist of 4-bit color indexes, meaning that each separate sprite or tile can contain up to 16 unique colors. To color the sprite or tile, the color index is looked up in a palette. Each palette contains up to 16 colors. By changing the palette index, the same sprite or tile can be rendered in different colors, and different sprites or tiles can share the same palette. This saves a lot of memory. Figure 4 shows an example of how the bitmap and palette combination works.

Rendering the background is simple and deterministic and is thus performed on the fly by cascading the tilemap and bitmap memories, addressing the tilemap with hcount and vcount from the timing generator. Sprite rendering is, by contrast, much more complex, as the work needed to render a single pixel depends on the number of active sprites and their positions on the screen. This problem is solved by rendering the sprite contents for each scan line in advance and storing them in a line buffer. The line buffer consists of a BRAM which is divided into two halves (lines) of 512 pixels each. One line is read and output to the display while the next line is rendered into the other half of the memory. The rendering is controlled by an FSM, which is clocked at 100 MHz but synchronized with the VGA timing.
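The on-the-fly background lookup described above can be sketched in C as a software model. This is only an illustration of the tilemap, bitmap and palette cascade, not the VHDL used in the project: the struct layout, the memory sizes (`NUM_BITMAPS`, `NUM_PALETTES`), the 40x30 tilemap dimensions and the function name `background_pixel` are all assumptions.

```c
#include <stdint.h>

#define TILE_SIZE    16   /* tiles are 16x16 pixels (from the text) */
#define MAP_WIDTH    40   /* 640 / 16, assumed tilemap width */
#define MAP_HEIGHT   30   /* 480 / 16, assumed tilemap height */
#define NUM_BITMAPS  64   /* hypothetical memory size */
#define NUM_PALETTES 16   /* hypothetical memory size */

/* One tilemap entry: which bitmap to draw and which palette colors it. */
typedef struct {
    uint8_t bitmap_idx;
    uint8_t palette_idx;
} tile_entry_t;

/* Software stand-in for the GPU memories (layout is an assumption). */
typedef struct {
    tile_entry_t tilemap[MAP_HEIGHT][MAP_WIDTH];
    uint8_t      bitmaps[NUM_BITMAPS][TILE_SIZE][TILE_SIZE]; /* 4-bit color indexes */
    uint32_t     palettes[NUM_PALETTES][16];                 /* 18-bit RGB words */
} gpu_mem_t;

/* Cascade tilemap -> bitmap -> palette for the pixel at (hcount, vcount). */
uint32_t background_pixel(const gpu_mem_t *gpu, unsigned hcount, unsigned vcount)
{
    /* The upper counter bits select the tile, the lower four bits the pixel in it. */
    const tile_entry_t *tile = &gpu->tilemap[vcount / TILE_SIZE][hcount / TILE_SIZE];
    uint8_t color_idx =
        gpu->bitmaps[tile->bitmap_idx][vcount % TILE_SIZE][hcount % TILE_SIZE] & 0x0F;
    return gpu->palettes[tile->palette_idx][color_idx];
}
```

Changing only `palette_idx` in a tilemap entry recolors the tile without touching its bitmap, which is the memory saving the text refers to.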
When a line in the buffer has been displayed, and its contents have thus become outdated, the FSM starts rendering the next line:

1. Clear the line in the line buffer by writing transparent pixels at every position.
2. Iterate through the 512 sprite slots in the Sprite Attribute Memory (SAM). Each sprite occupies two 18-bit words in the SAM. The first word contains the X and Y coordinates and an enable flag. If the current sprite is not enabled, or if the currently rendered scan line does not intersect the sprite, judging from its Y coordinate, the FSM continues with the next sprite slot. Otherwise:
   a. Read the next sprite attribute word, containing bitmap_idx, palette_idx, invx and invy, and store it in the FSM state. invx and invy are flags which mirror the sprite in the X and Y direction, respectively.
   b. Read the 16 pixels of the intersection of the sprite bitmap, and write them into the line buffer together with the palette index, and with the transparency bit flipped. The pixel write is only committed if the color index value is not equal to the index of the transparent color; thus sprites can partially overlap each other.

The color and palette data from the tilemap core, as well as the color, palette and transparency data from the sprite core, are then fed to the palette block. The color and palette indices are selected from either the tilemap or the sprites using muxes, controlled by the transparency bit from the sprite core, and are then used to address the palette RAM. The resulting 18-bit word contains the red, green and blue color values as 6-bit words. Those are then used to control the video DAC on the development board.

The GPU is connected to the AXI bus using a single port AXI BRAM controller, which allows direct read and write operations to any of the memories contained in the GPU. Read support was originally not considered necessary, but was added late in the development to allow sprite sorting.

Figure 4: Illustration of the usage of palette and bitmap
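The per-line sprite rendering and the final sprite/tile selection described above can be modelled in C roughly as follows. This is a behavioural sketch, not the FSM itself: the struct layouts, the flat `palette_ram` addressing, the names `render_sprite_line` and `mux_pixel`, and the choice of color index 0 as the transparent color are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define LINE_WIDTH        512  /* pixels per line-buffer half (from the text) */
#define NUM_SPRITES       512  /* sprite slots in the SAM (from the text) */
#define SPRITE_SIZE       16
#define TRANSPARENT_COLOR 0    /* assumed index of the transparent color */

typedef struct { uint8_t px[SPRITE_SIZE][SPRITE_SIZE]; } bitmap_t;

typedef struct {               /* unpacked view of the two SAM words */
    int16_t x, y;
    bool    enable;
    uint8_t bitmap_idx, palette_idx;
    bool    invx, invy;
} sprite_t;

typedef struct {
    uint8_t color_idx, palette_idx;
    bool    transparent;
} line_pixel_t;

/* Render all sprites intersecting scan line `line` into one buffer half. */
void render_sprite_line(const sprite_t *sam, const bitmap_t *bitmaps,
                        int line, line_pixel_t *buf)
{
    /* Step 1: clear the line to transparent pixels. */
    for (int x = 0; x < LINE_WIDTH; x++)
        buf[x] = (line_pixel_t){ 0, 0, true };

    /* Step 2: visit every slot, skipping disabled or non-intersecting sprites. */
    for (int slot = 0; slot < NUM_SPRITES; slot++) {
        const sprite_t *s = &sam[slot];
        int row = line - s->y;
        if (!s->enable || row < 0 || row >= SPRITE_SIZE)
            continue;
        if (s->invy)
            row = SPRITE_SIZE - 1 - row;          /* mirror in Y */
        for (int col = 0; col < SPRITE_SIZE; col++) {
            int src = s->invx ? SPRITE_SIZE - 1 - col : col;  /* mirror in X */
            uint8_t color = bitmaps[s->bitmap_idx].px[row][src];
            int x = s->x + col;
            /* Transparent or off-screen pixels are not committed. */
            if (color == TRANSPARENT_COLOR || x < 0 || x >= LINE_WIDTH)
                continue;
            buf[x] = (line_pixel_t){ color, s->palette_idx, false };
        }
    }
}

/* Palette block: the sprite pixel wins unless its transparency bit is set. */
uint32_t mux_pixel(const uint32_t *palette_ram, line_pixel_t sprite,
                   uint8_t tile_color, uint8_t tile_palette)
{
    uint8_t color   = sprite.transparent ? tile_color   : sprite.color_idx;
    uint8_t palette = sprite.transparent ? tile_palette : sprite.palette_idx;
    return palette_ram[palette * 16u + color];
}
```

In this model a sprite in a later slot overwrites pixels written by earlier slots, so slot order determines which sprite appears on top where they overlap.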

Figure 5: Overview of the GPU (GPU addresser driven by Hcount and Vcount; tilemap BRAMs holding map and tiles; sprite handler with SAM, sprite BRAMs and line buffer; palette BRAMs producing RGB; all memories reachable from the AXI BRAM bus)

4.2 Sound generator

A simple way to generate audio signals is to use pulse width modulation (PWM). It works by changing the duty cycle of a square wave. When passed through a low-pass filter (or a speaker), the square wave is integrated into an analog signal: if the duty cycle is increased the analog signal rises, and if it is decreased the analog signal falls. Audio encoded as raw pulse code modulated (PCM) data is a sequence of sample values that can be used directly as duty-cycle values; when modulated at the correct rate they reproduce the audio wave.

Figure 6 shows a simplified schematic of the sound generator. Internally, an address is incremented at the sample rate, reading new sample data from the dual-port BRAM to the PWM component. This address wraps around when it reaches the last address of the BRAM, so that the generator never runs out of data. Data samples are continuously fed into the sound buffer by the MicroBlaze. While one half of the buffer is being sampled by the PWM component, the other half is being updated by the CPU. To know which half to update, the most significant bit (MSB) of the cyclic sample address is wired to the data-out signal, available for polling at any time by the CPU.

On the PWM side of the generator, the data sample is stored in a register and updated at the sample frequency. To generate the square wave, an 8-bit counter is incremented at the system clock frequency. At the start of the count the output signal is set high. The counter value is then compared to the sample data, and when the values are equal the output signal is set low. When the counter overflows after reaching 255, it restarts from 0 and the output signal is once again set high.

The sample frequency was chosen by dividing the system clock frequency by 10 times the resolution of the duty cycle. This means that the PWM output completes exactly 10 periods for every audio sample, which was thought to give a clearer, more even sound. 8-bit data samples were used, resulting in a resolution of 256 different duty-cycle widths and a sample frequency of about 39.1 kHz (100 MHz / (10 x 256)). This also means that two samples can be stored at each memory location, since the BRAMs are 18 bits wide.

The sound generator is connected to the host using an AXI BRAM controller. Reading from the sound controller returns a flag that indicates which buffer half is currently being read by the core, to allow synchronization with software.

Figure 6: The sound generator (sound buffer BRAMs written from the AXI BRAM bus, read at the 39.1 kHz sample clock, feeding the PWM component clocked at 100 MHz, which drives PWM_out)

4.3 Gamepad interface

A NES gamepad was used for the input interface of the game. The NES gamepad was used back in 1985 for Nintendo's first game console. It has a four-direction cross and 4 buttons (A, B, start and select, as shown in Figure 7). Figure 8 shows the pin-out of the gamepad. In this application, 5 out of 7 pins are used. GND and VCC provide power to the gamepad, whereas PULSE (serial clock), LATCH (asynchronous parallel load) and DATA (shift register output) are used for communication.

The gamepad is designed to be run at 5 V (being the native core voltage of the NES), but when tested with an oscilloscope and a pair of signal generators it worked equally well at 3.3 V, which is the voltage supplied by the development board. This simplified the development of the interface by avoiding both level conversion of the I/O and sourcing a higher supply voltage. For I/O on the development board, four Pmod connectors were used.
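The sample-rate arithmetic and the counter/compare behaviour of the PWM can be checked with a small C model. This is a sketch under stated assumptions: the function names are invented here, and the comparison `counter < sample` is one way to model "high from counter start until the counter equals the sample", with the compare taking priority when the sample is 0.

```c
#include <stdbool.h>
#include <stdint.h>

#define SYSTEM_CLOCK_HZ        100000000u  /* 100 MHz system clock (from the text) */
#define PWM_RESOLUTION         256u        /* 8-bit duty-cycle counter */
#define PWM_PERIODS_PER_SAMPLE 10u         /* PWM periods per audio sample */

/* Sample frequency = system clock / (10 * 256). */
double sample_rate_hz(void)
{
    return (double)SYSTEM_CLOCK_HZ / (PWM_PERIODS_PER_SAMPLE * PWM_RESOLUTION);
}

/* Output level for a given counter value: high until the counter reaches the sample. */
bool pwm_output(uint8_t counter, uint8_t sample)
{
    return counter < sample;
}

/* Number of clock cycles the output is high during one 256-cycle PWM period. */
unsigned pwm_high_cycles(uint8_t sample)
{
    unsigned high = 0;
    for (unsigned counter = 0; counter < PWM_RESOLUTION; counter++)
        high += pwm_output((uint8_t)counter, sample) ? 1u : 0u;
    return high;
}
```

With these constants `sample_rate_hz()` evaluates to 39062.5 Hz, which matches the figure of about 39.1 kHz given above, and the high time of each PWM period is proportional to the sample value.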

Figure 7: Button layout of a NES gamepad

Figure 8: Pin-out of a NES gamepad cord

The gamepad has a parallel-load shift register that loads the state of the buttons when LATCH is asserted. The 8-bit data word can then be clocked out using PULSE. The bits of the data signal arrive in the order shown in Figure 9. The latch and pulse signals are generated by the interface core, and the received data is stored in registers. The current state of the gamepad is always available as a memory mapped register on the AXI bus. Four identical gamepad controller cores are instantiated in the system, one for each gamepad.

Figure 9: Serialized data output, and the latch and pulse signals into the gamepad
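The latch-and-shift protocol described above can be illustrated with a bit-banged C model. In the project this sequence is generated by the hardware interface core, so the code below is only a sketch of the protocol: the `pad_io_t` callback structure, the simulated pad and all function names are hypothetical. The bit order (A, B, Select, Start, Up, Down, Left, Right) and the active-low DATA line follow the commonly documented NES controller behaviour and are assumed to match Figure 9. Timing delays are omitted.

```c
#include <stdint.h>

/* Hypothetical pin-access callbacks, so the routine can run against a simulation. */
typedef struct {
    void (*set_latch)(void *ctx, int level);
    void (*set_pulse)(void *ctx, int level);
    int  (*read_data)(void *ctx);
    void *ctx;
} pad_io_t;

/* Assumed bit order, first bit shifted out is A. */
enum {
    PAD_A = 1 << 0, PAD_B    = 1 << 1, PAD_SELECT = 1 << 2, PAD_START = 1 << 3,
    PAD_UP = 1 << 4, PAD_DOWN = 1 << 5, PAD_LEFT  = 1 << 6, PAD_RIGHT = 1 << 7
};

/* Latch the button states, then clock out 8 bits; returns 1 bits for pressed buttons. */
uint8_t pad_read(const pad_io_t *io)
{
    uint8_t state = 0;
    io->set_latch(io->ctx, 1);   /* parallel load of all buttons */
    io->set_latch(io->ctx, 0);
    for (int bit = 0; bit < 8; bit++) {
        if (io->read_data(io->ctx) == 0)   /* DATA is low while a button is pressed */
            state |= (uint8_t)(1u << bit);
        io->set_pulse(io->ctx, 1);         /* rising edge shifts to the next button */
        io->set_pulse(io->ctx, 0);
    }
    return state;
}

/* Minimal software model of the pad's shift register, for testing only. */
typedef struct { uint8_t buttons; uint8_t shift; int pulse; } sim_pad_t;

static void sim_set_latch(void *ctx, int level)
{
    sim_pad_t *p = ctx;
    if (level) p->shift = p->buttons;
}

static void sim_set_pulse(void *ctx, int level)
{
    sim_pad_t *p = ctx;
    if (level && !p->pulse) p->shift >>= 1;   /* shift on the rising edge */
    p->pulse = level;
}

static int sim_read_data(void *ctx)
{
    sim_pad_t *p = ctx;
    return (p->shift & 1) ? 0 : 1;            /* active low output */
}

pad_io_t sim_pad_io(sim_pad_t *pad)
{
    return (pad_io_t){ sim_set_latch, sim_set_pulse, sim_read_data, pad };
}
```

On real hardware the three callbacks would be replaced by accesses to the Pmod pins, or the whole routine by a single read of the memory mapped register that the gamepad core provides.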

5 Software

The game logic was written in C, to be compiled using Xilinx SDK and run on a MicroBlaze soft core. The game is interfaced through two functions: init_game, called at startup to initialize the game state and download the graphics to the GPU, and run_game, invoked through the vblank interrupt at 60 Hz to update the game state. The run_game function performs the following steps:

1. Reading the gamepad inputs and storing them as a bitfield in memory.
2. Updating all bombs, including animation, calculating blasts and detecting hit players.
3. Moving players, and detecting collisions against walls and powerups.
4. Updating the scores and the time at the top of the screen.
5. Sorting all active sprites according to their Y coordinates, in order to produce an illusion of depth when players walk through bombs or other players.
6. Disabling unused sprite slots.

All low-level routines were implemented in a separate module, called the Hardware Abstraction Layer (HAL). The HAL is responsible for configuring the interrupt controller, reading gamepads, generating pseudo-random numbers and rendering sound.

The sound engine runs in the background when the vblank interrupt routine is not executing, thus implicitly prioritizing the game logic over filling the sound buffer. This is because the game logic update has to complete before the start of the active area of the next frame to avoid visible glitches or tearing, whereas the sound buffer may be filled at any time during the video frame. The sound engine therefore busy-waits when synchronizing to the sound hardware. The same behaviour could be implemented using a buffer-empty interrupt from the sound generator hardware and prioritized, nested interrupts in the MicroBlaze, but this was considered an overly complex solution for this application. The primary gain from such a solution would be lower power consumption, as the MicroBlaze could be halted while waiting for an interrupt.
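The sprite sorting step in run_game could look roughly like the following C sketch. The `sprite_t` layout and the function names are hypothetical, and the sort direction is an assumption: sprites with a larger Y coordinate are placed in later slots on the premise that later slots are drawn on top, so that objects lower on the screen appear closer to the viewer. Disabled sprites are moved to the end.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical unpacked sprite attributes as kept by the game logic. */
typedef struct {
    int16_t x, y;
    bool    enable;
    uint8_t bitmap_idx, palette_idx;
} sprite_t;

/* Order: enabled sprites first, by ascending Y; disabled sprites last. */
static int compare_sprites(const void *a, const void *b)
{
    const sprite_t *sa = a, *sb = b;
    if (sa->enable != sb->enable)
        return sa->enable ? -1 : 1;
    return (sa->y > sb->y) - (sa->y < sb->y);
}

/* Sort the sprite table in place before it is written to the SAM. */
void sort_sprites_by_y(sprite_t *sprites, size_t count)
{
    qsort(sprites, count, sizeof sprites[0], compare_sprites);
}
```

Note that `qsort` is not guaranteed to be stable, so two sprites with the same Y coordinate may swap order between frames; a stable sort would avoid any resulting flicker.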
The sound is rendered by simply mixing the PCM streams of the music and any active effects and writing the result to the ring buffer in the sound generator. The streams are read directly from the SPI flash, which is conveniently memory mapped using the execute-in-place (XIP) feature in the memory controller. A large amount of non-volatile memory could have been saved if the music had instead been generated using a sequencer and/or synth in software on the MicroBlaze, but this was deemed out of scope for this project.

To parallelize the workload during development, an emulator framework was written in C using libsdl, which allowed the game logic to be compiled and tested on a PC. The emulator code replaces the HAL code used when targeting the real hardware. Even though this kind of emulation in general only applies at source code level, this emulator was written to make use of the same data structures as the real sprite handler and tilemap memories.
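The mixing of PCM streams into the idle half of the ring buffer, as described in the sound rendering paragraph above, might be sketched as follows. The report does not specify the mixing arithmetic, so this example assumes unsigned 8-bit samples centered at 128 and a saturating add; the names `mix2` and `fill_idle_half` and the byte-per-sample buffer layout are also assumptions (the real buffer packs two samples per 18-bit word).

```c
#include <stddef.h>
#include <stdint.h>

/* Mix two unsigned 8-bit samples around the 128 midpoint, clamping on overflow. */
uint8_t mix2(uint8_t a, uint8_t b)
{
    int sum = (a - 128) + (b - 128);
    if (sum > 127)  sum = 127;
    if (sum < -128) sum = -128;
    return (uint8_t)(sum + 128);
}

/*
 * Fill the buffer half that is NOT being played.
 * playing_half is the flag read back from the sound generator (0 or 1);
 * effect may be NULL when no sound effect is active.
 */
void fill_idle_half(uint8_t *ring, size_t half_len, int playing_half,
                    const uint8_t *music, const uint8_t *effect)
{
    uint8_t *dest = ring + (playing_half ? 0 : half_len);
    for (size_t i = 0; i < half_len; i++)
        dest[i] = effect ? mix2(music[i], effect[i]) : music[i];
}
```

In the busy-wait scheme described earlier, the caller would poll the flag until it changes and then call `fill_idle_half` once for the half that has just finished playing.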

Graphics data was generated using a purpose-built Python application called pixl8, shown in Figure 10. This allowed interactive design of bitmaps and palettes and generated C header files to be used directly in the application code.

Figure 10: pixl8, a bitmap and palette editor

6 Installation

The VGA cable is connected to a monitor and the gamepads are connected to the four Pmod connectors on the board, using pin 12 (VCC), pin 11 (GND), pin 10 (CUP), pin 4 (OUT0) and pin 3 (D1). Program the board with the download.bit file using the Digilent Adept tool. The controls of the game are as follows: A is used to place bombs and the direction cross is used to move the bomberman characters.

7 Problems and conclusions

The project was largely successful, with the GPU and gamepad interface implemented as planned. The sound generator, enabling music and sound effects, was also completed even though it was considered an optional bonus. The software was developed in time and performed well enough. There were some problems with implementing and debugging the AXI bus interface of the custom IP cores, but those were resolved and did not affect the project. Some software debugging was performed on the MicroBlaze using XMD. The overall development was straightforward and successful, as a result of a well-designed architecture.

8 Contributions

Xilinx software: Adam, Fredrik
GPU: Adam, Fredrik, Magnus
Gamepad I/F: Magnus
Sound Generator: Adam, Magnus
Game logic: Mostly Fredrik; Magnus did the powerups part
Tools: Fredrik

9 References

1. NES Controller, http://www.mit.edu/~tarvizo/nes-controller.html
2. Digilent Adept, http://www.digilentinc.com/products/detail.cfm?prod=adept2