
Title of Project: Shape Controlled DJ
Team members: Eric Biesbrock, Daniel Cheng, Jinkyu Lee, Irene Zhu

I. Introduction and overview of project

Our project aims to combine image and sound processing into one interactive product called the Shape Controlled DJ. A camera captures different shapes or objects, and based on their orientation and relation to each other, different effects (e.g. volume change, pitch shift, etc.) will be applied to the music. Music is more than sound; it is a form of self-expression. This product turns music into an interactive experience by combining visual and physical cues and incorporating them into the music. The Shape Controlled DJ provides an intuitive and unique way to control and create music. Professional DJs could use this extremely portable system to experiment with sounds and remix tracks on the fly, while people of any background could use the system for entertainment and social purposes.

This project will work with two different areas in digital signal processing: image processing and sound processing. The image processing part requires sampling frames from a video camera that is capturing different shapes. From these samples, we need to apply color filtering to differentiate the shapes from their background. Next, we must determine the location and orientation of the objects, and whether and by how much they have changed since the last frame. The other part of the project requires extracting the frequency components from a preloaded audio track, manipulating the spectrum to create different effects, and outputting the new signal to the speakers. To create the different effects, we will use the following methods (a short sketch of a few of them follows the list):

- Volume Change: scaling the signal amplitude
- Pitch Shift: changing the length of the sound, then performing a sample rate conversion [3]
- Flanging: mixing the signal with a slightly delayed, gradually varying copy of itself [2]
- Reverberation: mixing many delayed repetitions very close together [1]
- Ring Modulation: multiplying the signal by a simple sine wave [4]
- Chorus: similar to flanging, but with a longer delay [2]
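A minimal MATLAB sketch of three of these effects (volume change, ring modulation, and flanging) is given below. The file name, carrier frequency, sweep rate, and delay length are placeholder values for illustration only, not final design choices.

    % Prototype of three planned effects on a mono track (placeholder values).
    [x, fs] = audioread('track.wav');             % preloaded audio track (placeholder file)
    x = mean(x, 2);                               % fold to mono for simplicity
    n = (0:length(x)-1).';

    % Volume change: scale the signal amplitude
    gain  = 0.5;
    y_vol = gain * x;

    % Ring modulation: multiply the signal by a simple sine wave
    fc     = 440;                                 % carrier frequency (placeholder)
    y_ring = x .* sin(2*pi*fc*n/fs);

    % Flanging: mix the signal with a slowly varying delayed copy of itself
    maxDelay = round(0.003 * fs);                 % ~3 ms maximum delay (placeholder)
    lfo      = 0.5 * (1 + sin(2*pi*0.25*n/fs));   % 0.25 Hz delay sweep
    d        = 1 + round(lfo * (maxDelay - 1));   % per-sample delay in samples
    y_flange = x;
    for k = maxDelay+1:length(x)
        y_flange(k) = 0.5 * (x(k) + x(k - d(k)));
    end

    sound(y_flange, fs);                          % listen to the flanged result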

Not all the effects may be implemented, but we want to have at least three working effects (volume, pitch, flanging). We want to be able to output the mixed signal to the speakers in stereo at a 192 kbps rate. We will use the Altera DE2-70 FPGA board and the TI C5515 DSP stick. The image processing will be done on the Altera board, while the sound processing will be done on the DSP stick. We will also need access to a camera, a monitor, and speakers; all of these devices are available in the 452 lab. At the CoE Design Expo, we hope to show a product that tracks two shapes with the camera and changes the volume, shifts the pitch, and adds flanging to the original audio track based on the location and orientation of the shapes. In addition to our design poster, we will need a camera, a monitor, and a set of speakers to allow expo attendees to mix a preloaded music track themselves with the shapes provided.

II. Description of project

i. The goal of the project is to create a system that takes in video input of the shapes and audio input of a track, and outputs the track with modifications based on the locations and angles of the shapes picked up by the camera. The camera will capture 24 frames per second at a resolution of 640 by 480. Using these frames, we will filter out the background and perform edge detection on the shapes. The edges will give us the centroid and orientation of each shape as well as the distance between the shapes. We will compare the orientations of the shapes and the distances between them from frame to frame to determine the parameters. These parameters will be sent to the DSP stick and used to change the sound of the input audio track. An algorithm will modify the frequency spectrum of the track to add certain effects (e.g. volume change, pitch shift, flanging) based on these parameters. The track will then be output from the DSP stick to a set of speakers. The system is feasible if we can interface the two devices, implement filters to process the frames, and implement the algorithms to compute and use the parameters to modify the track. A sketch of the per-frame shape analysis is shown below.
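The following MATLAB sketch illustrates the per-frame shape analysis for a single colored shape on a plain background. The color thresholds are guesses, and the orientation here is computed from image moments of the binary mask rather than from the planned edge detection; it shows the kind of processing involved, not the final algorithm.

    % Per-frame shape analysis sketch (thresholds and file name are placeholders).
    frame = double(imread('frame.png')) / 255;    % hypothetical captured frame, scaled to [0,1]
    hsv   = rgb2hsv(frame);
    mask  = hsv(:,:,2) > 0.5 & hsv(:,:,3) > 0.3;  % crude color/background filter

    % Centroid and orientation from image moments of the binary mask
    [rows, cols] = find(mask);
    cx  = mean(cols);  cy = mean(rows);           % centroid (pixels)
    u20 = mean((cols - cx).^2);
    u02 = mean((rows - cy).^2);
    u11 = mean((cols - cx) .* (rows - cy));
    theta = 0.5 * atan2(2*u11, u20 - u02);        % orientation angle (radians)

    % With two shapes, the distance between their centroids is the third parameter:
    % dist = hypot(cx2 - cx1, cy2 - cy1);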

ii. The system (shown in the accompanying figure) gathers input from the shapes and their orientation using a video camera. The video camera is connected to the FPGA via the video-in port. The FPGA will display the camera input on a monitor over a VGA cable for tracking and testing purposes, as well as for display at the Design Expo. The FPGA will also filter out the background to help find the orientation of the shapes and the distance between them; these are the parameters for the sound modification algorithm. The FPGA's line out will be connected to the audio in of the DSP stick. The DSP stick will receive the parameters through the audio-in port, and the track to be remixed will be on an SD card inserted in the SD card slot on the back of the DSP stick. The DSP will modify the frequency spectrum of the audio track using the parameters from the FPGA and an effects algorithm; the modified track will be output to the speakers through the audio-out port on the DSP stick.

iii. The most crucial component of our system will be the interface between the image processing and sound processing devices, as it is essential to our project's success. After the interface, the next two most important components are the image and sound processing units themselves. If 24 frames per second cannot be achieved in the image processing component, we will have to lower the frame rate to something we can use, extract fewer parameters from the frames, or develop ways to get multiple parameters from a single measurement. If the sound processing has too much delay while playing the track, we will have to hold back the output to counter the delay or lower the bitrate of the input and output tracks.

iv. Table 1 lists the resources that we will need to complete our project and their availability in the 452 lab.

Table 1. Parts List and Prices

Component | Name and Model # | Use | Available in lab? | Cost
Camera | BU TP 1004DN | Capturing visual input of shapes | Yes | $0
Monitor | N/A | Displaying real-time movement of shapes | Yes | $0
FPGA | Altera DE2-70 | Processing image data | Yes | $0
DSP stick | TI C5515 DSK | Processing sound data | Yes | $0
Memory for holding tracks | SanDisk 4GB microSDHC card | Holding the audio track to be remixed | No [5] | $10-12
Speakers | N/A | Playing audio of the remixed track | Yes | $0

III. Milestones

a. Milestone 1 (achievable by March 15)
By March 15, we need to have a proof of concept done in MATLAB to show that the project is feasible in hardware, and to begin testing the image and sound modules independently of each other in hardware. The two major parts of the project are identifying the shape parameters (orientation angle, shape type, relative distance) and using these parameters to adjust the sound characteristics (volume, pitch, flanging) of an audio track; a placeholder sketch of one possible parameter-to-effect mapping is given at the end of this section. These two parts must be working independently by Milestone 1. Specifically, the imaging part of the project must be able to take a video image from the camera, extract the shape parameters, and send the values to the TI C5515 DSK using MATLAB. The sound processing part of the project must be able to process input audio, modify the track based on the changing parameters from the FPGA, and drive the modified output to the speakers. We will demonstrate that the imaging module can feed the parameters to the sound processing module, which produces the modified sound, both in real time. We will then translate the MATLAB code to Verilog and C for implementation on the DE2-70 FPGA and the TI C5515 DSK, respectively.

b. Milestone 2 (achievable by April 13)
By April 13, the imaging and sound processing modules must be interfaced together in hardware. The imaging module should be able to output the video from the camera to a VGA monitor for easier testing. If time allows, it will also display angles and draw distance lines between shapes for testing and demonstration purposes. The parameters that affect the sound module should also be able to be fed in manually from the switches on the FPGA board for easier testing; if time allows, the parameters should also be displayed on the VGA monitor. After each part has been tested and proven functional, we will link the two modules to complete the overall system.

c. One major potential problem in achieving Milestone 1 is the real-time image processing. Since no one in our group has experience in image processing, a substantial amount of research must be done to implement our desired functionality. The major concern in achieving Milestone 2 is the interfacing of the two modules. Getting the two modules linked in real time in hardware will be much harder than the software implementation in MATLAB, and it is the backbone of our project.
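The mapping from shape parameters to effect settings has not been finalized. The short MATLAB sketch below shows one placeholder mapping we might start from in the proof of concept; the ranges and the shape-type encoding are assumptions, not design decisions.

    % Placeholder mapping from shape parameters to effect settings.
    angleDeg  = 30;     % orientation of shape 1, from the imaging module
    distPx    = 120;    % distance between the two shapes, in pixels
    shapeType = 1;      % e.g. 1 = triangle, 2 = square (assumed encoding)

    gain     = min(max(distPx / 300, 0), 1);       % distance   -> volume (0..1)
    semitone = round((angleDeg / 180) * 12) - 6;   % angle      -> pitch shift (-6..+6 semitones)
    flangeOn = (shapeType == 1);                   % shape type -> flanging on/off

    fprintf('gain = %.2f, pitch shift = %+d semitones, flange = %d\n', ...
            gain, semitone, flangeOn);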

IV. Contributions of each member of team

Table 2 lists the contributions of each member. Our project has two main signal processing components, so we assigned two members to each part.

Table 2. Team Member Contributions

Member | Component | Expertise | Assigned Contribution
Daniel Cheng | Image Processing | Filtering | Image processing filtering: get the input from the camera, filter the shapes from the background, edge detection
Jinkyu Lee | Image Processing | Algorithm, code | Develop the algorithm for distance and angle calculation on the processed image; the calculated parameters will be used by the sound processing module
Irene Zhu | Sound Processing | Algorithm, code | Develop the sound effect algorithms that modify the volume and pitch and add flanging to the input audio, using the parameters sent by the image processing module
Eric Biesbrock | Sound Processing | Sound processing | Sound processing component: build filters that get the input audio into the TI C5515 DSK and output the audio after correctly modifying the track

This is a general breakdown of our tasks; if there are any setbacks in one module, all members are expected to help get the module back on schedule. The team meets weekly on Tuesday at 12:30, with the exception of spring break. Each member will be working independently on his or her individual parts; the meetings serve as a time to discuss the progress or issues of each part. Meeting minutes are taken at each meeting and then emailed to all members afterwards. As the project progresses, we may need to hold meetings more often; however, we will always meet on Tuesdays. In addition, if a situation arises that needs immediate attention, all team members will be contacted for an emergency meeting. We will be using a combination of Ctools, Google docs, and e-mail in addition to face-to-face meetings. We have already developed a Ctools site; this is used mainly to share relevant research papers, websites, and written documents (e.g. the proposal drafts, MATLAB code, etc.). Google docs will be used so that members can work on a document simultaneously; for instance, our presentation slides will be uploaded to Google docs and each member will add slides for their own speaking parts.

E-mail will be our main source of communication outside of meetings; it will serve for quick correspondence or to initiate a meeting where we can further discuss issues. We have also created a Gantt chart that will be updated weekly to reflect our progress and to keep each other informed of our goals. Table 3 is our Gantt chart.

Table 3. Gantt Chart

V. References and citations

[1] http://users.utcluj.ro/~atn/papers/atn_4_2008_1.pdf
[2] http://denniscronin.net/dsp/article.html
[3] http://www.dspdimension.com/admin/time-pitch-overview/
[4] http://ezinearticles.com/?audio-effects---compression-and-ring-modulation&id=310986
[5] http://www.radioshack.com/product/index.jsp?productid=2950499