Lab 2: Hardware/Software Co-design with the Wimp51
CpE 214: Digital Engineering Lab II
Last revised: February 26, 2013 (CAC)

Hardware/software co-design, now standard in industry, is an approach that brings hardware and software together early and often. Instead of working separately, hardware and software teams work together throughout the design process. Computer models of both hardware and software are built early and are simulated together at regular intervals. Problems are identified quickly and are eliminated before the design progresses too far. This process allows many products to be brought to market faster and at lower cost than with older approaches.

This lab will be your first introduction both to hardware/software co-design and to programming in assembly language (ASM) for microcontrollers. You will write a multiplication program and prototype it on the Wimp51, a microprocessor model based on the 8051 and developed here at Rolla, to eliminate any potential problems early in the design process. You will be able to observe how the Wimp51 handles your code instructions at each clock cycle.

0.1 Outline and Concepts

1. Design a 7-segment decoder, top-level circuit in Quartus II
2. Introduction to the Wimp51
3. Writing your multiplication program in ASM
4. Simulation with ModelSim
5. Testing with Cyclone II FPGA

1 Design a 7-segment decoder, top-level circuit in Quartus II

In the Wimp51, each data bus is 8 bits wide, so its value can be represented by two hex digits (one per 4-bit nibble). We are going to use 7-segment decoders to view the processor's datapaths on the FPGA's 7-segment displays as your code runs. First off, you may already have a 7-segment decoder that you can copy directly into your project folder and then import into the project.
Depending on your 112 instructor, you may have built the design by hand as a standard block diagram file (.bdf), or they may have provided you with an .edf file that is just a code version. Either is fine. If you can't locate it, don't worry: you can skip working out all of the K-maps by simply following my Appendix B example.

1. Open Altera Quartus II and create a new project called Lab2. Create a block diagram file for the seven-segment decoder; you can find the design in Appendix B. Name this entity ss_decoder and save a symbol for it.
2. Create another block diagram file for the top-level entity, called Lab2. From the lab2 folder on Blackboard, copy the following files to your lab 2 folder and add them to your project: i8051_lib.vhd, wimp51_rom.vhd, wimp51.edf, WIMP51_ROM.bsf, Lab2_pins.csv.
3. Create a symbol for the wimp51.edf file. Then wire up the circuit as shown in Figure 1.
4. Compile the project. After a successful compilation, we can proceed to work on the software.
FIG. 1: Top-level schematic for design
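
As a rough illustration of what the decoder stage does, the Python sketch below splits an 8-bit bus value into its two nibbles and looks up a segment pattern for each. The function names, the bit order (bit 0 = segment a through bit 6 = segment g), and the active-low default are assumptions made for this sketch only; the Appendix B design and your board's documentation remain the actual reference.

```python
# Sketch only: segment order and polarity are assumptions, not taken from
# Appendix B. Bit 0 = segment a, bit 1 = segment b, ..., bit 6 = segment g.

# Active-high segment patterns for the hex digits 0 through F.
SEGMENTS_ACTIVE_HIGH = [
    0x3F, 0x06, 0x5B, 0x4F, 0x66, 0x6D, 0x7D, 0x07,  # 0-7
    0x7F, 0x6F, 0x77, 0x7C, 0x39, 0x5E, 0x79, 0x71,  # 8-F
]


def split_nibbles(byte):
    """Return (upper, lower) 4-bit nibbles of an 8-bit bus value."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("bus value must fit in 8 bits")
    return (byte >> 4) & 0xF, byte & 0xF


def ss_decode(nibble, active_low=True):
    """Return the 7-bit segment pattern for one hex digit.

    With active_low=True (assumed here), a 0 bit lights a segment.
    """
    if not 0 <= nibble <= 0xF:
        raise ValueError("nibble must be in the range 0..15")
    pattern = SEGMENTS_ACTIVE_HIGH[nibble]
    return (~pattern & 0x7F) if active_low else pattern


if __name__ == "__main__":
    upper, lower = split_nibbles(0x2D)  # 2Dh is the expected product
    print(format(ss_decode(upper), "07b"), format(ss_decode(lower), "07b"))
```

Each 8-bit bus in the top-level circuit needs two decoder instances, one per nibble, which is why the displays are used in pairs.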

2 Introduction to the Wimp51

The Wimp51 was developed here at Rolla as a simplified model to help students understand how a microprocessor works at the most basic level. You can consider it a "chopped-down" 8051, in that the Wimp51 has only about a dozen instructions at its disposal. This means that an assembly program developed for the Wimp51 will run just fine on an 8051, but not necessarily the other way around. The Wimp51 block diagram is shown below in Figure 2:

FIG. 2: Wimp51 block diagram

The instruction set is given in the following table:

    Instruction    Encoding             Operation
    MOV A, #D      01110100 dddddddd    A<=D
    ADDC A, #D     00110100 dddddddd    C,A<=A+D+C
    MOV Rn, A      11111nnn             Rn<=A
    MOV A, Rn      11101nnn             A<=Rn
    ADDC A, Rn     00111nnn             C,A<=A+Rn+C
    ORL A, Rn      01001nnn             A<=A OR Rn
    ANL A, Rn      01011nnn             A<=A AND Rn
    XRL A, Rn      01101nnn             A<=A XOR Rn
    SWAP A         11000100             A<=A(3..0) & A(7..4)
    CLR C          11000011             C<=0
    SETB C         11010011             C<=1
    SJMP rel       10000000 aaaaaaaa    PC<=PC+rel+2
    JZ rel         01100000 aaaaaaaa    PC<=PC+rel+2 if Z (A=0)

If you attached a logic analyzer to the 8-bit data buses in the Wimp51 diagram above while a program was running, you would see something like Figure 3:
FIG. 3: Simulation of Wimp51 data buses

Once you have to start writing programs with this few instructions, you will soon grow to appreciate Intel's original 8051 reference design, for all its perceived lack of power.

3 Writing your multiplication program in ASM

Multiplication can, of course, be treated as a series of repeated additions. If the numbers you are multiplying get large, however, this method becomes very inefficient. For your program, you will need to write a generalized multiplication routine for 3x15 that uses looping. This is trickier than you might think, since you don't have certain conditional jump instructions at your disposal. A skeleton to get the program started is as follows:

    PC
    0000h         MOV A, #03h      ; A <= 03h
    0002h         MOV R1, A        ; R1 <= A
    0003h  LOOP:  MOV A, #0Fh      ; A <= 0Fh
                  ADDC A, #0FFh    ; C,A <= A + FFh + C
           STOP:  SJMP STOP        ; jumps to itself, so the program halts here
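
To make the hand-assembly step concrete, here is a small Python sketch that encodes the kinds of instructions appearing in the skeleton, using the bit patterns from the instruction set table. The helper names are hypothetical, and the sketch assumes that PC in the jump formula is the address of the jump opcode and that rel is a signed 8-bit value, as on the 8051. It does not assemble a complete program; the missing instructions and their addresses are still yours to work out.

```python
# Opcode bit patterns are copied from the instruction set table.
# All helper names are hypothetical and exist only for this sketch.


def mov_a_imm(d):
    """MOV A, #D -> 01110100 dddddddd (two bytes)."""
    return [0b01110100, d & 0xFF]


def addc_a_imm(d):
    """ADDC A, #D -> 00110100 dddddddd (two bytes)."""
    return [0b00110100, d & 0xFF]


def mov_rn_a(n):
    """MOV Rn, A -> 11111nnn (one byte)."""
    if not 0 <= n <= 7:
        raise ValueError("register number must be 0..7")
    return [0b11111000 | n]


def rel_offset(pc, target):
    """Solve target = pc + rel + 2 for rel, as an 8-bit two's complement value.

    Assumes pc is the address of the jump opcode and rel is signed.
    """
    rel = target - (pc + 2)
    if not -128 <= rel <= 127:
        raise ValueError("target is out of range for an 8-bit offset")
    return rel & 0xFF


def sjmp(pc, target):
    """SJMP rel -> 10000000 aaaaaaaa (two bytes)."""
    return [0b10000000, rel_offset(pc, target)]


def as_binary_strings(code):
    """Format bytes as quoted 8-bit binary strings separated by commas."""
    return ", ".join('"{:08b}"'.format(b) for b in code)


if __name__ == "__main__":
    print(as_binary_strings(mov_a_imm(0x03)))   # MOV A, #03h
    print(as_binary_strings(mov_rn_a(1)))       # MOV R1, A
    # 0x10 is a made-up address for STOP, used only to show a self-jump.
    print(as_binary_strings(sjmp(0x10, 0x10)))  # STOP: SJMP STOP
```

Note that a jump to its own address always encodes rel as FEh (that is, -2), whatever the address, because the offset is relative to the end of the two-byte instruction.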

It is your job to work out the rest of the instructions that go in between the lines of the skeleton. Once you have the instructions, you need to convert them into their binary equivalents using the instruction set table provided. You will also need to work out the Program Counter (PC) address of each instruction in code memory to get the jump targets correct. I can help with this, and it will be covered in detail in the 213 lecture if you aren't familiar with it. Once done, follow these steps:

1. Modify wimp51_rom.vhd to contain the binary instructions for your program (03H by 0FH = 2DH). This will serve as the code definition for the I8051_ROM symbol. Each byte needs to be in quotes, e.g. "00110100", with a comma between bytes. Also be sure to set the length of your instruction array: you will see SIZE in the code definitions, so you can change it there. VHDL counts 0 as a position (like C), so make SIZE 1 less than the actual number of entries (bytes) in your array.
2. Save the file and compile your project. If compilation succeeds, you can move on to simulating the design.

4 Simulation with ModelSim

Copy the VHDL files needed for simulation in ModelSim (provided on Blackboard), along with your wimp51_rom.vhd, to a new folder (a subfolder of your lab2 folder is good). Use ModelSim to create a new project with these files, and observe these waveforms as in Figure 4: clk, rst, addr, data, acc. First, set the clock period to 100ps. Then type the following into the command line, where /lab2/ is the name of your project:

    force -freeze /lab2/rst 1 0
    force -freeze /lab2/rst 0 50ps
    run 100ns

FIG. 4: ModelSim example results of program
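
As a quick sanity check on the simulation settings, the Python sketch below works out how many clock periods the 100ns run covers and roughly how many instructions fit in it. The constant and function names are hypothetical, and the figure of 3 clocks per instruction is taken from Section 5. Treat the result as an approximation, since it ignores exactly where the first clock edge falls relative to the reset release at 50ps.

```python
# All times are in picoseconds so that the arithmetic stays in integers.
CLOCK_PERIOD_PS = 100          # clock period set in ModelSim
RESET_RELEASE_PS = 50          # rst is forced to 0 at 50ps
RUN_TIME_PS = 100 * 1000       # "run 100ns"
CLOCKS_PER_INSTRUCTION = 3     # Fetch, Decode, Execute (see Section 5)


def clocks_in_run(run_ps=RUN_TIME_PS, period_ps=CLOCK_PERIOD_PS):
    """Number of whole clock periods that fit in the simulated time."""
    return run_ps // period_ps


def instructions_in_run(run_ps=RUN_TIME_PS, period_ps=CLOCK_PERIOD_PS):
    """Approximate number of instructions that can complete in the run."""
    return clocks_in_run(run_ps, period_ps) // CLOCKS_PER_INSTRUCTION


if __name__ == "__main__":
    print("clock periods simulated:", clocks_in_run())
    print("instructions that fit (approx.):", instructions_in_run())
    print("reset held for", RESET_RELEASE_PS / CLOCK_PERIOD_PS, "clock periods")
```

If your program executes more instructions than this estimate allows before reaching STOP, lengthen the run time accordingly.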

5 Testing with Cyclone II FPGA

Testing your design on the FPGA is no more difficult than using ModelSim, and probably more satisfying, because you can step through your code clock by clock and read out the individual bytes on the 7-segment displays.

1. To assign the pins, FIRST make sure that your pin names match Figure 1 and are as follows: R0_0, R0_1, data0, data1, addr0, addr1, acc0, acc1 (where 0/1 are the lower/upper nibble respectively). Then on the top menu, click Assignments -> Import assignments and choose your Lab2_pins.csv to import the correct pin assignments. We will use all 8 seven-segment displays available on the board to observe the lines. From left to right on the board, here is what will be displayed:

    R0_1  R0_0  data1  data0  addr1  addr0  acc1  acc0

2. Compile the project again with the pin assignments. Then download the design to the DE-2 FPGA board using the programmer in Quartus II.
3. Observe the seven-segment displays while controlling the clock with the push button. Since the Wimp51 is very basic, 3 clock cycles are required to completely process an instruction (one each for Fetch, Decode, Execute) before moving on to the next. Observe the changes in the address, data, accumulator, and the contents of R0 at every clock edge, and note your observations. Also note the different behavior for instructions of different sizes. Find the number of clocks needed to complete one instruction, and the total number of clock cycles needed to get the final result of the multiplication.

What if it doesn't seem to be working, or the program doesn't seem to be progressing? Most likely one of two things is to blame: 1) your underlying ASM program is incorrect, or 2) your ASM-to-binary translation is wrong, more specifically in your conditional jump instructions. Also, in the past, certain FPGA boards have been more finicky with this lab.
If you are correctly giving 3 clock pulses per instruction and still have trouble, try switching to another board.

6 Questions (attach at end of your report)

1. What are the different sizes of instructions that you implemented using the Wimp51?
2. How many instructions did your code have?
3. How many clocks were needed to get the final multiplication result? (Hint: each instruction takes 3 clock cycles: Fetch, Decode, Execute.)
4. Describe the different pipeline stages in the Wimp51 (referring to the diagram provided).