Chapter 1
Introduction
Psychologists are strange people. We teach them how to do all sorts of wondrous things, and they don't pay any attention. They behave just like real people. But perhaps there's hope. Colin and Jennifer met in graduate school, and decided that marriage was much too complicated. So they just lived together. But that had its own complications, and about the time that they were in their late 30's and had secure faculty appointments at a good university, they finally decided to get married. They even decided that having children would be nice, as long as they could find someone to look after the children, cook their meals, wash their clothes, take them for walks in the park and later to soccer practice, and do it all for $6 per hour with no benefits.
But then came thoughts of all the problems that children bring. The kids might not be very smart. They might get into fights with the kids next door, and spark a lawsuit. They might want to go to college at a school where Colin and Jennifer don't get a tuition waiver. So our protagonists decided that they ought to do a little research on this kid thing before they got themselves into what a former U.S. president called "deep doodoo." Being trained as psychologists, they knew that there must be data available that speak to their problem, and they went hunting for it.
A quick search of the web produced some data on newborn infants. Gary McClelland, at the University of Colorado, has a collection of Apgar scores for 60 children, along with characteristics of each child's mother. (The data are available as a text file at They are also reproduced at the back of this supplement, and are available as an SPSS system file at apgar.sav.) An Apgar score is a measure of neonatal development. You simply rate a newborn infant as 0, 1, or 2 on each of 5 dimensions (heart rate, breathing effort, muscle tone, reflex irritability, and color), and then sum those scores, giving an Apgar score of 0 to 10, where 10 is best.
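The arithmetic behind an Apgar score is simple enough to sketch in a few lines of Python. This is only an illustration and has nothing to do with SPSS itself; the function name apgar_score, the dimension labels, and the example ratings are all invented for this sketch.

```python
# Illustrative sketch only; the names here are invented for this example.
APGAR_DIMENSIONS = (
    "heart_rate",
    "breathing_effort",
    "muscle_tone",
    "reflex_irritability",
    "color",
)


def apgar_score(ratings):
    """Sum the five 0/1/2 ratings into a single Apgar score (0 to 10)."""
    total = 0
    for dimension in APGAR_DIMENSIONS:
        rating = ratings[dimension]  # KeyError if a dimension was not rated
        if rating not in (0, 1, 2):
            raise ValueError(f"{dimension} must be 0, 1, or 2, not {rating!r}")
        total += rating
    return total


# Hypothetical ratings for one infant (not taken from the Apgar data file).
example_infant = {
    "heart_rate": 2,
    "breathing_effort": 1,
    "muscle_tone": 2,
    "reflex_irritability": 1,
    "color": 2,
}
print(apgar_score(example_infant))  # prints 8
```

Because each of the five ratings can be at most 2, the largest possible total is 10, which is why the scale runs from 0 to 10.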
The Apgar data file also contains information on the sex of the child, whether or not the mother smokes, how much weight the mother gained during her pregnancy, the gestational age of the child, the degree of prenatal care the mother received, and the family's annual income. Thus these data provide the opportunity to examine the relationship between several important variables and the health of the newborn child.
Chapter 2
The Purpose of the Supplement
This supplement is intended to do two things. The first is to illustrate the use of SPSS-Student Edition, and the second is to base that presentation around an additional example. I have chosen the example of our friends Jennifer and Colin because, although they are fictitious, the data speak to some real issues that confront people. What do psychologists know about neonatal development that can guide our behavior? I have chosen to use SPSS because it is one of the most popular statistical packages available, and because it will do all of the analyses we need. In fact, its ability to do so many different kinds of analyses will help us to discover things in the data that we might not discover if we were working with simple pencil and paper.
SPSS is a very powerful statistical analysis package. Just about everything we would want to do with these data can be done by the use of simple pull-down menus. Once you become familiar with the menu structure, you can pretty much figure out how to do whatever you need, including data transformations, graphing, and statistical analyses.
The data sets that I have chosen are data that are readily available on the Internet. My goals in searching for data included finding data that would be of interest to readers, that contained a reasonable number of cases, and that fit together to allow me to "tell a story." A common approach is to select several different data sets to address different kinds of problems. I chose, instead, to select data that focused on the same general problem, and the problem that I chose was prenatal development. Psychologists, pediatricians, and epidemiologists actually know quite a bit about the influence of various maternal behaviors on subsequent development of the embryo. I will use two data sets in this supplement.
Chapter 3
An Introduction to SPSS
This chapter is intended as an introduction to SPSS. We will see how to read or enter data, how to provide labels for our variables, how to specify the nature of our variables and how they will be presented, and how to save them in a usable format. The specifics of using SPSS to graph data and to run statistical analyses will be covered in subsequent chapters when needed.
There are so many features of SPSS that I cannot even attempt to cover them all. What I will present here will get you off to a solid start, and the rest you can learn on your own. The nice thing about computer software, especially when it is menu driven, is that you can experiment. As long as you save your data, you really can't do any harm. If you want to find out how something works, click on it and see what happens. The worst that can happen is that you will have to reload your data and start again, and that is hardly the end of the world. You should also remember that you have a manual that came with your software, and that there is a help menu available. When all else fails, you can give up and look things up in the manual. But most people don't read manuals. (In fact, most manuals don't seem to be written to be read.) So play around first, and then go to the manual. You'll learn more that way.
Getting started
I will begin with the assumption that you have a copy of SPSS loaded on the computer that you are using. If you have trouble installing the software, your instructor will be able to assist you. What follows is written specifically for people running the student version, but will apply equally well to the complete version or to the graduate package.
To open SPSS, double click on its icon on your screen if there is one. If not, you will find it listed on the Start menu (probably under Programs) and can open it from there. Depending on how your copy is configured, it may come up with a standard spreadsheet, or it may ask you what you want to do. If the latter, indicate that you want to create a new data file.
If you are one of those people like me who can spend a lot of time getting just the right configuration, you'll be in heaven with the preferences windows. You can set anything you can imagine, and some things that you may regret having set. If you like playing when you should be working, these preferences are for you. On the other hand, if you just want to start up a piece of software and get to work, you can ignore the preferences entirely. It may mean that your printout (and even your dialog boxes) will look slightly different from mine, but that should not cause even minor problems.
Entering data:
There are several ways to enter data into SPSS, and we'll cover the most common ones. You can start with a blank spreadsheet and type in the data. We'll do that first. Alternatively, if you have the raw data in a text file, also known as an ASCII file or a dat file, you can tell SPSS to read those raw data into the spreadsheet. Finally, if you or someone else has entered the data into SPSS and saved it as a system file (usually with the .sav extension), you can simply open that file. It is also possible to read data from an Excel spreadsheet or other kinds of file formats, but we will skip that route. You should be able to figure it out on your own. (Hint: just click on the File/Open menu and select the appropriate type of format.)
Entering from the keyboard:
We will start by entering data by hand from the keyboard. This is the easiest approach when you have the raw data on paper, and need to type (some would say "keyboard," but not me) it into a file. This is particularly convenient when you have a small set of data.
The following is a small portion of the Apgar data that Jennifer and Colin are interested in. I have included only six variables and five cases to save space, but all of our analyses will be done on the complete data set.
OBS / APGAR / GENDER / SMOKES / WGTGAIN / GESTAT
1 / 6 / 0 / 0 / 22 / 37
2 / 5 / 0 / 0 / 50 / 35
3 / 4 / 0 / 0 / 60 / 36
4 / 4 / 1 / 0 / 60 / 37
5 / 6 / 0 / 0 / 35 / 41
When you start up SPSS you will see a spreadsheet resembling the following figure.
The variable names appear in the grayed-out row and are currently labeled v1, var, var, etc. We want to start by entering the names (and characteristics) of our variables. So, double click on v1, and you will go to the following dialog box.
In the cell labeled v1, type the name of the variable. I would enter "obs," indicating that this column just numbers the observations, but you can type any name you wish. (A variable name cannot exceed 8 characters, and all will come out as lowercase, no matter what you type.) For this particular variable, that's all we need to do, and we'll ignore the other options.
If you double click on the box at the top of the second column, you will see a dialog box similar to the one above, except that the variable name is temporarily var0001. Double click on this and enter "apgar." But there are some other things about our data that we might want to specify. If you click on the Type button, you can indicate how many digits, and how many decimal places, will be displayed. Since Apgar scores are integers between 0 and 10, the data will be easier to read if you set the number of decimals to 0. There is no particular advantage that I have seen to limiting the number of digits to the left of the decimal.
People might not know what your variable name stands for, especially since it is limited to 8 characters. So the next thing to do is to provide a better label for the variable. If you click on the Labels button, you can enter a more descriptive name. "Apgar score" is probably not a great improvement here, but variable labels are often a useful device.
After naming obs and apgar, double click on the heading of the next column, and enter "gender" as the variable name. Click on the Type button, set the column to 0 decimal places, and close that dialog box. Then click on the Labels button, and you will see a dialog box similar to the one that follows.
Since "Gender" can sometimes be a confusing label, we should be more descriptive. I have labeled this variable "Sex of child." Because I know that the data are coded with a 0 for a male and a 1 for a female, I entered 1 next to "Value" and Female next to "Value Label." When I click "Add," that designation will be entered in the box below, where I have already added "0 = Male." (If I forgot to click the Add button now, but hit "Continue" instead, I would get an error message asking if I really wanted to leave out "1 = Female.")
Suppose that there were missing values for Gender. If we just left the column blank, SPSS would enter a period in the cell, and treat that as a missing value. But suppose that we wanted to distinguish between different kinds of missing values. Sometimes data are missing because they weren't collected, sometimes because they "do not apply," sometimes because the person refused to answer, and sometimes because the reported value is so absurd that it could not possibly be right. SPSS allows us to specify values for different kinds of missing values. For example, we could use 9 for "Not Reported," 99 for "Does Not Apply," and so on. To tell SPSS to treat these values (here, 9 and 99) as missing, click on the "Missing values" button and enter the various values that you have chosen to indicate types of missingness.
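If it helps to see the idea of value labels outside the dialog box, here is a minimal Python sketch. It is not SPSS code, and the names GENDER_LABELS, gender_codes, and gender_display are invented for the illustration. The codes are the gender values from the five sample cases shown earlier.

```python
# Value labels for the gender variable, as described in the text.
GENDER_LABELS = {0: "Male", 1: "Female"}

# Gender codes for the five sample cases shown earlier.
gender_codes = [0, 0, 0, 1, 0]

# The stored data stay numeric; labels are looked up only for display.
gender_display = [GENDER_LABELS[code] for code in gender_codes]
print(gender_display)
```

The point of the design is that the data themselves remain 0s and 1s, so they can still be used in calculations, while the labels make printed output readable.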
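The logic of user-defined missing values can be sketched in Python as follows. This is an illustration under my own assumptions, not a description of how SPSS works internally: the function split_missing and the raw_gender values are hypothetical, and only the codes 9 and 99 come from the example above.

```python
from collections import Counter

# Codes chosen in the text to flag different kinds of missing data.
MISSING_CODES = {9: "Not Reported", 99: "Does Not Apply"}


def split_missing(values, missing_codes=MISSING_CODES):
    """Return (valid values, Counter of reasons the others are missing)."""
    valid = []
    reasons = Counter()
    for value in values:
        if value is None:
            reasons["Blank"] += 1  # plays the role of an empty cell
        elif value in missing_codes:
            reasons[missing_codes[value]] += 1
        else:
            valid.append(value)
    return valid, reasons


# Hypothetical gender column containing real codes, blanks, and missing codes.
raw_gender = [0, 1, 9, None, 99, 1, 9]
valid, reasons = split_missing(raw_gender)
print(valid)          # [0, 1, 1]
print(dict(reasons))
```

Notice that the missing codes must be values that could never occur as real data. A 9 is safe for gender, but it would be a poor choice for an Apgar score, where 9 is a perfectly legitimate value.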
The only other button on the variable dialog box is Column Format. That allows us to set the width of the column. We sometimes want to make narrow columns so that we can see more variables on the screen. You can also do the same thing by putting the cursor between the columns in the grayed-out row and, when it turns into a double-headed arrow, dragging the boundary left and right.
Once you have entered all variable names and descriptions, you can start entering data. You simply put your cursor in the first cell, enter the value, and move on to the other cells. It doesn't matter whether you work down the page, or across, just so long as you put the numbers in the correct columns. Use whichever approach you find easier.
When you have entered all of the data (or even enough that you are worried about losing them), click on the File/Save menu, supply a file name, and press enter. This will save the data to a "system file," which includes not only the data, but the variable names and labels, information about missing values, and so on. Traditionally SPSS uses the file extension ".sav" for these files.
Importing ASCII files:
In the case of the Apgar data, we already have all of the variables entered in a text file. (That's the file I downloaded off the web and saved as apgar.dat.) There are two ways of importing ASCII data, depending on how the data are entered in the file. I'll only speak about the Read ASCII Data command, although you can often open an ASCII file with the standard Open command.
To read an ASCII file, select Read ASCII File from the file menu and select free-format if each column is separated from its neighbor by a space, a comma, a tab character, or some other character. (This has become the way that data are most often entered.) You will now see the following dialog box.
We will indicate that a comma or space separates columns by clicking on the appropriate radio button. But SPSS won't know how to name these variables, or even how many there are, unless we tell it. So we enter the variable names, one at a time. In the example I have already entered "obs" as the name of the first variable, and clicked on "Add." The asterisk after obs indicates that it is a numerical variable. You should enter the names of all nine variables. The names are obs, case, apgar, sex, smokes, wtgain, gestat, prenatal, and anninc. Then click on OK and the data will be entered. (I have left in the variable "case" because it is in the original data. We have no need for that variable, but if we didn't enter something there SPSS would think that we had only 8 variables, and would mess things up when it read the file. You wouldn't hand your mother a picture of three people and say "they are Mary and John.")
Oops! You pressed enter, and now there is a problem. The first line of the resulting data on the spreadsheet is blank. That's because the data file was not exactly as I have described it. McClelland is a nice guy, and he thoughtfully put the names of the variables on the top of each column. Since we had left SPSS to think that the data were all numeric, it threw away those names. McClelland had also included a blank line before the data for readability, and that, too, confused the issue. Not to worry. Just click the "1" in the left margin, which will select the whole row. Then hit the delete key and you will get rid of those blanks.
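To see why the number of names has to match the number of columns, here is a rough Python sketch of free-format reading. It is an illustration only, not what SPSS does internally: the function read_free_format is hypothetical, and to keep things short the sample uses the first three cases and six variables from the small table shown earlier rather than the full nine-variable file. Unlike SPSS as described above, this sketch stops with an error rather than misreading the file.

```python
import re


def read_free_format(text, names):
    """Parse lines whose values are separated by commas and/or whitespace."""
    records = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
        if len(fields) != len(names):
            raise ValueError(
                f"line {line_number}: {len(fields)} values but {len(names)} names"
            )
        records.append(dict(zip(names, (float(f) for f in fields))))
    return records


names = ["obs", "apgar", "gender", "smokes", "wgtgain", "gestat"]
# First three sample cases; the second line uses commas to show both separators.
sample = "1 6 0 0 22 37\n2,5,0,0,50,35\n3 4 0 0 60 36"

records = read_free_format(sample, names)
print(records[0]["apgar"])  # 6.0

# Leaving out one name is the "Mary and John" problem: six values, five names.
try:
    read_free_format(sample, names[:-1])
except ValueError as error:
    print(error)
```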
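If you were cleaning up such a file yourself before importing it, the same repair could be sketched in Python. This is a hypothetical helper, not an SPSS feature. The sample text assumes a layout like the one just described (a row of names, a blank line, then the data) and borrows two cases from the small table shown earlier.

```python
def read_numeric_rows(text):
    """Keep only lines in which every field is a number."""
    rows = []
    for line in text.splitlines():
        fields = line.replace(",", " ").split()
        if not fields:
            continue  # skip a blank line
        try:
            rows.append([float(field) for field in fields])
        except ValueError:
            continue  # skip a line of variable names
    return rows


# Assumed layout: header row, blank line, then data.
sample = "OBS APGAR GENDER SMOKES WGTGAIN GESTAT\n\n1 6 0 0 22 37\n2 5 0 0 50 35\n"
rows = read_numeric_rows(sample)
print(len(rows))  # 2
```

A word of caution: silently skipping every line that fails to parse is convenient for a header, but it would also hide a genuinely mistyped data line, so it is worth comparing the number of rows you end up with against the number you expected.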
Reading System Files:
Reading ASCII data is easier than entering it by hand, but the easiest way to enter data is to have someone else do the work. If someone else has entered the data, labeled all the variables, and saved the file with a name like apgar.sav, all you have to do is to click on the File/Open menu, navigate to that file, and press "enter." The data will be read in, the proper names and labels will be applied, and you will be all set to go. Since I did the work for you, you can just load the apgar.sav file. As a quick check, there should be 60 lines of data and the first column should read 1, 2, 3, ..., 60. If you see that, chances are that everything is fine.
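The quick check just described amounts to a one-line comparison, sketched here in Python. The function name looks_complete is invented for the illustration, and the sketch assumes you already have the first column available as a list of numbers.

```python
def looks_complete(obs_column, expected_cases=60):
    """True if the column reads 1, 2, 3, ..., expected_cases with nothing missing."""
    return list(obs_column) == list(range(1, expected_cases + 1))


print(looks_complete(range(1, 61)))  # True: 60 cases numbered 1 through 60
print(looks_complete([1, 2, 4]))     # False: wrong count and a gap in the numbering
```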