MM104 Biostatistics 2010-11


SPSS EXERCISES 1. Descriptive statistics.

1. The UNICEF data set water.sav contains, for each non-industrialized country, the percentage of the population that had access to "improved drinking water sources" in 2004[1]. Each row represents one country. The variables of interest are named urban and rural; each holds the percentage of the corresponding population that has access to improved drinking water sources.

α) What shape do the two distributions appear to have?

[Analyze ► Descriptive statistics ► Explore...]

β) Are there any outliers? What conclusions can be drawn about the proportions of the populations that have access to improved drinking water sources?
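As a cross-check outside SPSS, the rule usually used to flag box-plot outliers (points more than 1.5 interquartile ranges beyond the quartiles) can be sketched in Python. The function name iqr_outliers and the data values are made up for illustration and are not taken from water.sav. Packages compute quartiles in slightly different ways, so borderline points may be classified differently than in the Explore output.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return (lower_fence, upper_fence, outliers) using the k * IQR rule."""
    # Quartiles by linear interpolation; SPSS may use a different convention.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return lower, upper, outliers


# Made-up percentages for illustration only; NOT the real UNICEF values.
rural = [95, 92, 90, 88, 85, 84, 80, 78, 75, 30]

lower, upper, outliers = iqr_outliers(rural)
print("fences:", lower, upper)
print("flagged as outliers:", outliers)
```

A value is flagged only if it falls outside the two fences; whether a flagged country is an error or a genuine extreme still has to be judged from the data.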

2. Answer the following questions using the file Rouvas20.sav. This file contains measurements from 20 nursery school children from the region of Rouvas.

α) Use the variable systco to calculate the proportion of children with high systolic blood pressure.

[Analyze ► Descriptive statistics ► Frequencies...]

β) Are the proportions of boys and girls with high systolic b.p. similar[2]? [Answer without attempting a hypothesis test.]

[Analyze ► Descriptive statistics ► Crosstabs...]

γ) Does the average systolic b.p. (avsyst) appear to differ according to obesity status?

[Transform ► Recode...

Analyze ► Descriptive statistics ► Explore...]
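For part β, the comparison that Crosstabs produces (the proportion with high systolic b.p. within each sex) can be sketched in Python as follows. The records are hypothetical and are not the contents of Rouvas20.sav; the group sizes of 10 boys and 10 girls and the function name proportion_high are assumptions chosen only to make the arithmetic easy to follow.

```python
from collections import Counter


def proportion_high(records):
    """Proportion of children with high systolic b.p. within each sex."""
    totals = Counter(sex for sex, _ in records)
    high = Counter(sex for sex, is_high in records if is_high)
    return {sex: high[sex] / totals[sex] for sex in totals}


# Hypothetical (sex, high b.p.?) records; NOT the real Rouvas20.sav data.
children = (
    [("boy", True)] * 2 + [("boy", False)] * 8
    + [("girl", True)] * 3 + [("girl", False)] * 7
)

print(proportion_high(children))

# Reclassify a single girl (2 high instead of 3) to see how far the
# proportion moves when the group is this small.
one_fewer = (
    [("boy", True)] * 2 + [("boy", False)] * 8
    + [("girl", True)] * 2 + [("girl", False)] * 8
)
print(proportion_high(one_fewer))
```

With groups this small, one child moves a proportion by ten percentage points, which is why the comparison should be read descriptively.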

3. From the book «Αρχές Βιοστατιστικής» (Principles of Biostatistics), pp. 36 and 62. The file "unicef" is an Excel file containing the proportions of infants with low birthweight (variable name lowbwt). Open it in Excel and then in SPSS. You will notice that some countries do not provide data. Does that create a problem in SPSS?

a) Create a box plot for these data.

b) Do the data appear skewed?

c) Are there outliers?

d) Calculate the mean and median of the data.

e) Which of these two measures is preferable as a measure of central location? Why?

4. Create an SPSS data file containing the data presented below (in two columns).

To confirm the mean and variance stated below, use the command

Data ► Weight cases... with the variable that contains the frequencies as the Frequency variable.

Then calculate the summary statistics.
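What weighting by a frequency variable amounts to can be sketched in Python: each value counts as many times as its frequency. The values and frequencies below are invented placeholders, not the data of this exercise, and the function name weighted_summary is hypothetical. The sketch assumes the sample variance uses the total frequency minus one as its denominator, which matches expanding the table into one row per case.

```python
import statistics


def weighted_summary(values, freqs):
    """Return (n, mean, sample variance) for a frequency table."""
    n = sum(freqs)  # total number of cases
    mean = sum(v * f for v, f in zip(values, freqs)) / n
    ss = sum(f * (v - mean) ** 2 for v, f in zip(values, freqs))
    variance = ss / (n - 1)  # sample variance, denominator n - 1
    return n, mean, variance


# Placeholder frequency table for illustration only.
values = [0, 1, 2, 3]
freqs = [4, 3, 2, 1]

n, mean, variance = weighted_summary(values, freqs)
print(n, mean, variance)

# Same result from the expanded data, one entry per case.
expanded = [v for v, f in zip(values, freqs) for _ in range(f)]
print(statistics.mean(expanded), statistics.variance(expanded))
```

Both routes give the same mean and variance, so entering the table in two columns and weighting is equivalent to typing out every case.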

------

The following SPSS commands are used in the exercises:

Data ► Weight cases...

Analyze ► Descriptive Statistics ► Explore...

Analyze ► Descriptive Statistics ► Crosstabs...

Analyze ► Descriptive Statistics ► Frequencies...

Transform ► Recode...

------

[1] http://www.unicef.org/progressforchildren/2006n5/index.html “It is estimated that unsafe water and a lack of basic sanitation and hygiene every year claim the lives of more than 1.5 million children under 5 years old from diarrhoea.” Improved drinking water sources: household connection, public standpipe, borehole, protected dug well, protected spring, rainwater collection.

[2] The frequencies are small here, so small changes (e.g. 2 girls instead of 3) will result in relatively large changes in the proportions.