Picking up where I left off:
Time Analysis:
Picking up from where I left off will require a few steps. I’ll try to lay things out and make it as easy as possible. This project is about analyzing C programs as they run on the ARM processor.
You will start with a C program (a benchmark) and run it through uVision (micro vision), a program that compiles for and simulates the ARM architecture. uVision gives you the two things our project needs: the execution time and the assembly code. I wasn’t able to figure out a good way to produce an assembly file from uVision, so I copied and pasted the code from the debugger screen while in disassembly mode. I recommend looking at Zach’s video to get started with uVision.
Here are some other things that I found out about uVision. Originally we thought we needed to change xtal MHz from 12.0 to 60.0 (this is under options for Target 1), but later in the year we determined that it should stay at 12.0. Make sure you comment out all printf statements in the benchmark. When you are in debugger mode and trying to get the assembly code, go to the disassembly view, right click, and select Assembly Mode instead of Mixed Mode. Another convenient feature of debugger mode is that you can right click in the disassembly view and go to execution profiling, where you can enable either show time or show calls. You will find both of these useful. I have set up a lot of the uVision files for you. You will find these in a folder called uVision Projects in the zipped folder you should receive from Dr. Healy.
After you copy the assembly code from uVision, paste it into a Notepad document and save it as a .s file. You will then need to run it through a Java program I wrote called converter. This will produce a .ss file that you can then run through RALPHO. The converter program simply changes the format of the data to what RALPHO is expecting. I did not spend much time on this small program, but it does have a few quirks, which are described in my journal.
Once you have the .ss file, it is time to run it through RALPHO. RALPHO is a Python program named ralpho_arm.py. To my knowledge, RALPHO should work well. The purpose of RALPHO is to turn the .ss file into a .inf file. RALPHO requires an ADF file, which stands for Assembly Definition File. The purpose of this file is to provide information specific to the ARM architecture. It contains information about opcodes and mnemonics. The .adf file we are using is called armv4a.adf.
Once you get your .inf file, it is time to move to the timing analyzer itself. The first thing you need to do is create a .ist file. You can do this by running the program inf2ist. Once you do this, you can run time.bin -iv [filepath] (add -v if you want verbose output, which I recommend when debugging). This will give you the analyzer’s output. I know it seems like a lot of steps to get here, but once you do it a few times, you’ll realize it’s not that bad. The hardest part is figuring out where the errors come from and what you should tweak.
To edit RALPHO I used a program called Aptana Studio, and to edit converter.java I used Eclipse. Other than that, I edited everything for the timing analyzer on Linux using emacs. I always ran things on Linux as well because I found it more convenient.
When you unzip my file, you will see a 2012 folder and a 2013 folder. 2012 is everything I was given. A lot of it is repeated in 2013, but most of it has been updated there. This is how I would run a file through the timing analyzer from start to finish.
- Start with the benchmark (we’ll say simpleArray.c)
- Create a uVision project (follow Zach’s instructions)
- Copy and paste the assembly code into a text document (simpleArray.s)
- Using FileZilla or WinSCP, upload this to the folder /2013/ralpho/
- Run it through the converter (java converter simpleArray.s)
- Run the new .ss file through RALPHO (python2 ralpho_arm.py simpleArray.ss)
- Move the .inf file to the folder of inf files in time (mv simpleArray.inf ../time/inf_files/)
- Go to the time directory (cd ../time)
- Run the inf file through inf2ist (inf2ist inf_files/simpleArray) [do not use .inf at the end of the file]
- Run the program through the timing analyzer (time.bin -iv inf_files/simpleArray)
- If you want verbose output, add -v (time.bin -iv -v inf_files/simpleArray)
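The steps above can be collected into a small helper. This is only a sketch in Python, not something that exists in the zipped folder: the function name pipeline_commands is made up, and the RALPHO invocation (python2 ralpho_arm.py followed by the .ss file) is an assumption based on the script name given earlier. It builds the command lines in order but does not run them.

```python
def pipeline_commands(benchmark, verbose=False):
    """Return the commands for one benchmark, in the order of the list above.

    Hypothetical helper: it only builds the command lines, nothing is executed.
    """
    # inf2ist and time.bin take the name without the .inf extension
    inf_path = "inf_files/" + benchmark
    time_cmd = ["time.bin", "-iv"] + (["-v"] if verbose else []) + [inf_path]
    return [
        ["java", "converter", benchmark + ".s"],           # .s  -> .ss
        ["python2", "ralpho_arm.py", benchmark + ".ss"],   # .ss -> .inf (assumed invocation)
        ["mv", benchmark + ".inf", "../time/inf_files/"],
        ["cd", "../time"],
        ["inf2ist", inf_path],                             # .inf -> .ist
        time_cmd,
    ]


for cmd in pipeline_commands("simpleArray", verbose=True):
    print(" ".join(cmd))
```

Printing the commands first makes it easy to check paths before running anything by hand.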
My Journal:
May 28, 2013
Today was day one of my research. I began the morning by reading a detailed synopsis of what Zach did last year with the project. I then went to the library and checked out two books about programming in C. After getting my account up and running on a different server, I spent the afternoon exploring the different features of Linux and editing programs written in C. At one point, I could not yet access my new server so I was using the old one. Once I gained access, I successfully transferred all of my files from the old server to the new one. Another success I had today was downloading files from Dr. Healy’s website using a Linux command called wget. So far today has been about getting acquainted with my new workspace, becoming more familiar with a different operating system, and becoming more proficient in programming in C. Tomorrow Dr. Healy and I will further explore the actual project at hand.
May 29, 2013
Today started the research of the actual project! This morning I finished up reading about programming in C. I guess I’ve mastered the language in two days, but I’ll keep my reference books just in case. Jumping into the middle of this project has been kind of confusing, but I’m finally getting a handle on what’s going on. The basis of what I’m working on now is figuring out how Zach did a few things so that I can do them myself and also to pick up from where he left off.
The first task of today was figuring out how to create assembly files (.s) from a C program. I think that if I use a program on Zach’s computer (uVision), it gives me the information which I can then copy and paste into an editor to be an assembly file. It would still be nice if the program created one for me. I’ll keep playing with it!
I figured out what an ADF file is, and though we found many in the zipped file from last year, we decided to use armv4a.adf for now. This file has the information about the hardware that RALPHO needs in order to run. I should also make sure this is a complete and correct file. My next task is to test what I’ve done so far. I need to write some simple C programs and run them through the compiler to get an assembly file and then run that through RALPHO. Over time, I should be able to troubleshoot the program’s bugs. However, before this is possible, I need to thoroughly understand what the output of RALPHO is telling me. It seems possible that a lot of the code at the beginning of the output is just startup jargon, stuff that isn’t important to my programs. It also seems as though there might be an option in RALPHO to make .xml and .inf files, maybe both? While I’m debugging RALPHO, I need to make sure it is complete and see what features it currently has. Hopefully in about a week RALPHO will be able to identify the following in a C program: loops, number of loop iterations, blocks, instructions, etc. Once I finish with RALPHO and understand the output files, I will get to move on to the timing analyzer, which takes these files as input.
May 30, 2013
Today I made some really great progress. I feel like I might be pretty close to where they left off last year! I started out this morning by looking up and confirming that it’s possible to put uVision onto my computer. However, right now we don’t need that! We had been compiling through uVision to get the assembly code, but we figured out today that we can compile on the Raspberry Pi machines and the compiler will produce an assembly file for us. This is much easier than what we were doing! Dr. Healy also took a good amount of time to explain assembly language to me today. With this advance, I got to move on and start using RALPHO!
I started out by writing some very simple programs that just used one loop and an array. With these programs, I was able to understand the assembly code that I produced and I could follow RALPHO’s output. RALPHO seems to be doing well. I did find the error that Zach admitted was there. When there is a function that starts with ‘r,’ RALPHO thinks this is a register and disregards it as a function. This is kind of problematic. Tomorrow I think we’re going to focus more on how to solve this problem. There is also one other detail we are going to focus on tomorrow which is looking at the macros to see if it correctly handles these or not. It does produce some sort of error message when running RALPHO but it also produces an INF file. We will see what’s going on with that tomorrow!
May 31, 2013
Today I started debugging. This is always tough. I don’t feel like I accomplished as much today, but that’s because we were looking for the wrong answers. I spent a large portion of the morning looking for opcodes that, we have since concluded, don’t exist. As of now, we don’t think we will be able to analyze C programs that use floating point. We need to stick with integers. The compiler on the simulator uses a function for floating point calculations. We might have to implement this in the future. I have also spent the day trying to figure out why some of the instructions in the assembly file don’t carry over into the inf file that RALPHO produces. There’s still something kind of funny about RALPHO printing invalid instructions all the time. Next week will be spent further debugging RALPHO and determining which bugs really do exist and which ones need to be fixed.
June 3, 2013
Today I finally got all of the needed programs on my computer. Zach helped me move uVision from his computer to mine, and I also downloaded Aptana Studio, which is supposed to help me with debugging Python programs. After figuring out that gcc -S was not producing the assembly code we wanted, we decided we needed to figure out a way to use the assembly code from uVision. Today I wrote a small program called converter that reads uVision assembly code and outputs a file formatted similarly to gcc’s output.
June 4, 2013
This morning I finished up the converter program I started writing yesterday. I spent a large portion of today examining RALPHO and trying to understand why some of the instructions were omitted from the INF file it produced. I found that uVision formats one of the instructions differently and I am now trying to figure out how to accommodate this. When uVision produces the assembly code for an add instruction, it uses four operands instead of three. We had to go into the ADF file and add commas to the operands beside the opcodes to accommodate this; however, that of course caused some new issues. One specific problem was encountered near the end of RALPHO, where we had a tokenizer. The extra comma produced a new token, but the tokenizer wasn’t handling it. We solved this by counting from the right of the array instead of from the left. This was a cool thing I didn’t know you could do in Python. I think there are still a few more issues with RALPHO but I will focus on those tomorrow.
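For anyone who has not seen counting from the right in Python, here is a small illustration. It is not RALPHO’s actual tokenizer code; the helper name last_operand and the operand strings are made up for the example.

```python
def last_operand(operand_field):
    """Return the final comma-separated token, however many tokens there are."""
    tokens = operand_field.split(",")
    # Index -1 counts from the right, so it is the last token for 3 or 4 operands
    return tokens[-1]


print(last_operand("r0,r1,r2"))        # r2
print(last_operand("r0,r1,r2,LSL#0"))  # LSL#0
```

Counting from the left, the position of the last token shifts when the extra operand appears; counting from the right, it does not.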
June 5, 2013
Today was a productive day. I got several programs to run through RALPHO. As of now, I’m pretty confident in MOST of RALPHO. There are still a couple of errors to be fixed, but hopefully they are small. Today’s big progress was fixing yesterday’s issue. In more detail, here is what happened. When uVision compiles a program, some instructions are printed differently than when gcc compiles the same program. We sort of knew this and that’s why I wrote the converter program, but today we found another specific case. I’m going to talk about the instruction ‘add’, though this happened with other instructions as well. With the add instruction, the ADF file was expecting 3 registers followed by a shift explaining what to do with them (for example, LSL). The first thing I had to do was add commas in the ADF file so that it would tokenize all operands (for example: r,r,r,LSL#0). After doing that, I had to rewrite a small portion of the code in RALPHO to read in all of the tokens, not just a set number. There were still some instructions that didn’t get recognized, so after examining the hexadecimal code we came to this conclusion: if the hexadecimal digit is a 0 or a 1, and there are two or three registers given, the instruction must have a shift amount. If one is not given, we default to a left shift of 0, so we need to add LSL#0 to the operands. This is done in the Java program converter. There’s one exception that I’ve come across so far, and that is the compare. Though the compiler produces the mnemonic cmp, the hexadecimal digits that accompany it show that the compiler actually uses the opcode that corresponds to the mnemonic “cmps”, and cmps only wants two registers. This makes sense because you would only be comparing two things.
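The default-shift rule can be written down roughly as follows. This is a Python sketch of the rule as described in this entry, not the code in converter.java. The function name, the register pattern (r followed by digits), and passing the hexadecimal digit in as a parameter are all assumptions made for the illustration; which digit of the encoding is meant is left to the caller.

```python
import re

# Assumed register spelling for this sketch: r0, r1, ... (case-insensitive)
REGISTER = re.compile(r"^r\d+$", re.IGNORECASE)


def add_default_shift(mnemonic, operands, hex_digit):
    """Append LSL#0 when only two or three registers are given and no shift.

    hex_digit is the digit from the instruction encoding that the rule checks.
    """
    if mnemonic.lower() == "cmp":
        # Exception: the compare only wants its two registers
        return operands
    tokens = [t.strip() for t in operands.split(",")]
    only_registers = all(REGISTER.match(t) for t in tokens)
    if hex_digit in ("0", "1") and only_registers and len(tokens) in (2, 3):
        return ",".join(tokens + ["LSL#0"])
    # A shift or another kind of operand is already present: leave unchanged
    return operands


print(add_default_shift("add", "r0,r1,r2", "0"))        # r0,r1,r2,LSL#0
print(add_default_shift("add", "r0,r1,r2,LSL#2", "0"))  # r0,r1,r2,LSL#2
print(add_default_shift("cmp", "r0,r1", "1"))           # r0,r1
```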
The issues that I know still exist in RALPHO are, for one, it wants the main method to be first, and secondly, we still have the r mystery (it doesn’t pay attention to functions that start with the letter “r”).
June 6, 2013
I think I’m getting closer and closer to being finished with RALPHO. But, I’m certainly not there yet. Today I found a nice little bug that I need to fix. RALPHO tries to strip two-letter condition codes off the instruction mnemonics that it reads. Sometimes, however, a real mnemonic happens to contain those same two letters, and when RALPHO takes out what it thinks is a condition code, it mangles the mnemonic of a real instruction. I was hoping that I could just comment this code out, but I did run into a file that used the condition codes, so I have to go back and add some conditional statements to the code. I also made some progress on the missing block problem. I rewrote some code so that function names are added correctly to the array that stores them. The problem was that what it stored as a function name also included the memory address. So, when it compared a label name to the function names, it wouldn’t be an exact match. (Ex: function name = is_symmetric(0x0000024C), label name = is_symmetric, new_function_name = is_symmetric) This should allow ‘main’ to be anywhere in the program. However, bubblesort is not showing all function names right now and Fresnel2 is missing some blocks. So I will have to look into that mystery tomorrow. I also got some errors from uVision when I tried to build the targets (L6406E and L6407E), so I will have to look into those tomorrow as well. Dr. Healy won’t be here tomorrow, but I have a lot of little things to work on. Hopefully when he returns I’ll have some good news and positive things to show him!
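Both problems in this entry are easy to reproduce in a few lines. The following is only a Python sketch and not RALPHO’s code: the function names are made up, and the PROTECTED set is a small hypothetical sample of mnemonics that end in condition-code letters (teq, movs, bics), used here as one possible way to guard the stripping.

```python
import re

# Standard ARM condition codes
CONDITION_CODES = {"eq", "ne", "cs", "hs", "cc", "lo", "mi", "pl",
                   "vs", "vc", "hi", "ls", "ge", "lt", "gt", "le", "al"}

# Hypothetical sample: real mnemonics whose last two letters look like a code
PROTECTED = {"teq", "movs", "bics"}

# A trailing address such as (0x0000024C)
ADDRESS_SUFFIX = re.compile(r"\(0x[0-9A-Fa-f]+\)$")


def strip_condition(mnemonic):
    """Remove a trailing condition code unless the mnemonic is protected."""
    m = mnemonic.lower()
    if m in PROTECTED:
        return m
    if len(m) > 2 and m[-2:] in CONDITION_CODES:
        return m[:-2]
    return m


def function_name(label):
    """Drop the trailing memory address so names compare exactly with labels."""
    return ADDRESS_SUFFIX.sub("", label.strip())


print(strip_condition("addeq"))                   # add
print(strip_condition("teq"))                     # teq (would become "t" unguarded)
print(function_name("is_symmetric(0x0000024C)"))  # is_symmetric
```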
June 7, 2013
I was successful at resolving two of the RALPHO issues today! Though I made an attempt at a third, there’s going to have to be more work done on that one. Here’s what I did accomplish. When I was running the benchmark bubblesort through RALPHO, it was not acknowledging that bubblesort was a function. The reason is that when it read in the line “bl BubbleSort(0x0000028C)”, the code replaced all instances of the instruction (bl) with an empty string and then removed white space. When this was done, the resulting string was “BubeSort(0x0000028C)”. So my fix for this was to have it split into a list at the space. BubbleSort is now recognized as a function. I don’t know how many times we would have seen this, but it should be fixed. The other big accomplishment today was I fixed the “r” problem! The problem was that RALPHO was looking at the operands and assuming that if the first letter was an ‘r’ then it was a register and not a label. So, to solve this, I put a conditional in there that takes the length of the operand list after splitting it at every comma. If the length is only 1, then it must be a label or a bx statement with a register value. This seems to have solved that problem. The problem that remains goes back to the condition codes. I went through the ADF file and found all instruction codes that would be affected if a condition code was taken out of them and told RALPHO not to take letters out of those. However, there are a few instances where an instruction name appears to have 3 letters of condition codes. I will have to talk to Dr. Healy about how to solve this on Monday. The other problem that is left is with Fresnel2. I didn’t get a chance to look at this today, but I know the INF file does not start with block one. I will investigate that problem more on Monday as well. I also looked up the errors that occur in uVision today. I’m not exactly sure what they mean, but I know they have something to do with running out of memory. All in all, this has been a very productive week. I predict we will be finished debugging RALPHO very soon.
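The bubblesort bug and the two fixes from this entry can be reproduced as follows. This is a Python sketch rather than RALPHO’s code: it assumes the input line has a space between bl and the function name, and the helper name is_label_or_bx_target and the function name rotate are made up for the example.

```python
line = "bl BubbleSort(0x0000028C)"

# Old approach: delete every "bl", then strip white space.
# "BubbleSort" itself contains "bl", so the name gets damaged.
buggy = line.replace("bl", "").strip()
print(buggy)   # BubeSort(0x0000028C)

# Fix: split once at the white space and keep the operand untouched.
mnemonic, target = line.split(None, 1)
print(target)  # BubbleSort(0x0000028C)


def is_label_or_bx_target(operands):
    """A single operand is a label (or the register of a bx), even if it
    starts with the letter r."""
    return len(operands.split(",")) == 1


print(is_label_or_bx_target("rotate(0x00000300)"))  # True
print(is_label_or_bx_target("r0,r1,r2"))            # False
```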