Chapter 1

Introduction

CONTENTS

1.1 What is Statistics

1.2 Populations and Samples

1.3 Descriptive and Inferential Statistics

1.4 Brief History of Statistics

1.5 Computer Software for Statistical Analysis

1.1 What is Statistics

The word statistics in our everyday life means different things to different people. To a football fan, statistics are the information about rushing yardage, passing yardage, and first downs, given at halftime. To a manager of a power generating station, statistics may be information about the quantity of pollutants being released into the atmosphere. To a school principal, statistics are information on absenteeism, test scores and teacher salaries. To a medical researcher investigating the effects of a new drug, statistics are evidence of the success of research efforts. And to a college student, statistics are the grades made on all the quizzes in a course this semester.

Each of these people is using the word statistics correctly, yet each uses it in a slightly different way and for a somewhat different purpose. Statistics is a word that can refer to quantitative data or to a field of study.

As a field of study, statistics is the science of collecting, organizing and interpreting numerical facts, which we call data. We are bombarded by data in our everyday life. The collection and study of data are important in the work of many professions, so training in the science of statistics is valuable preparation for a variety of careers. Each month, for example, government statistical offices release the latest numerical information on unemployment and inflation. Economists and financial advisors, as well as policy makers in government and business, study these data in order to make informed decisions. Farmers study data from field trials of new crop varieties. Engineers gather data on the quality and reliability of manufactured products. Most areas of academic study make use of numbers, and therefore also make use of the methods of statistics.

Whatever else it may be, statistics is, first and foremost, a collection of tools used for converting raw data into information to help decision makers in their work.

The science of data - statistics - is the subject of this course.

1.2 Populations and Samples

In statistics, the data set that is the target of your interest is called a population. Notice that a statistical population does not refer to people, as in our everyday usage of the term; it refers to a collection of data.

Definition 1.1
A population is a collection (or set) of data that describes some phenomenon of interest to you.
Definition 1.2
A sample is a subset of data selected from a population.

Example 1.1 The population may be all women in a country, for example, in Indonesia. If from each city or province we select 50 women, then the set of selected women is a sample.

Example 1.2 The set of all soft drink bottles produced by a company is a population. For quality control, 150 soft drink bottles are selected at random. This portion is a sample.
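
The selection in Example 1.2 can be sketched in Python. This is only an illustration: the bottle identifiers, the population size of 10,000 and the fixed seed are assumptions made here, not part of the example.

```python
import random

# Hypothetical population: identifiers for 10,000 bottles
# (the population size is an assumption made for illustration).
population = list(range(1, 10_001))

# Select 150 bottles at random, without replacement, as in Example 1.2.
rng = random.Random(42)  # fixed seed so that the sketch is repeatable
sample = rng.sample(population, 150)

print(len(sample))                     # number of bottles selected
print(set(sample) <= set(population))  # the sample is a subset of the population
```

Because the selection is made without replacement, no bottle can appear in the sample twice.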

1.3 Descriptive and Inferential Statistics

If you have every measurement (or observation) of the population in hand, then statistical methodology can help you to describe this typically large set of data. We will find graphical and numerical ways to make sense out of a large mass of data. The branch of statistics devoted to this application is called descriptive statistics.

Definition 1.3
The branch of statistics devoted to the summarization and description of data (population or sample) is called descriptive statistics.
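
As a small illustration of numerical description, the sketch below summarizes a set of quiz grades using Python's standard statistics module. The eight grades are hypothetical values invented for this sketch.

```python
import statistics

# Hypothetical data set: quiz grades of one student (invented for illustration).
grades = [72, 85, 90, 66, 78, 85, 94, 70]

# A few numerical summaries that describe the data set.
mean_grade = statistics.mean(grades)      # arithmetic average
median_grade = statistics.median(grades)  # middle value of the sorted data
lowest, highest = min(grades), max(grades)

print("mean:  ", mean_grade)
print("median:", median_grade)
print("range: ", lowest, "to", highest)
```

Here the eight grades are replaced by a handful of numbers (average 80, median 81.5, range 66 to 94) that are easier to take in at a glance.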

If it is too expensive or impossible to acquire every measurement in the population, then we will want to select a sample of data from the population and use the sample to infer the nature of the population.

Definition 1.4
The branch of statistics concerned with using sample data to make an inference about a population of data is called inferential statistics.
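
The idea of inferring a population characteristic from a sample can be sketched as follows. The population here is simulated: the fill volumes (in millilitres), the population size, the sample size and the seed are all assumptions made for illustration. In practice the population mean would usually be unknown; it is computed here only so that the estimate can be compared with it.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so that the sketch is repeatable

# Hypothetical population: fill volumes (ml) of 10,000 bottles,
# simulated around 500 ml (all values are assumptions).
population = [rng.gauss(500, 5) for _ in range(10_000)]

# Select a random sample of 150 measurements from the population.
sample = rng.sample(population, 150)

# Use the sample mean as an estimate of the population mean.
sample_mean = statistics.mean(sample)
population_mean = statistics.mean(population)  # usually unknown in practice

print(f"sample mean:     {sample_mean:.2f}")
print(f"population mean: {population_mean:.2f}")
```

The two means will generally differ a little, since the sample contains only part of the population.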

1.4 Brief History of Statistics

The word statistics comes from the Italian word statista (meaning “statesman”). It was first used by Gottfried Achenwall (1719-1772), a professor at Marburg and Göttingen. Dr. E.A.W. Zimmermann introduced the word statistics to England. Its use was popularized by Sir John Sinclair in his work “Statistical Account of Scotland 1791-1799”. Long before the eighteenth century, however, people had been recording and using data.

Official government statistics are as old as recorded history. The emperor Yao took a census of the population in China in the year 2238 B.C. The Old Testament contains several accounts of census taking. Governments of ancient Babylonia, Egypt and Rome gathered detailed records of population and resources. In the Middle Ages, governments began to register the ownership of land. In A.D. 762 Charlemagne asked for detailed descriptions of church-owned properties. Early in the ninth century, he completed a statistical enumeration of the serfs attached to the land. About 1086, William the Conqueror ordered the writing of the Domesday Book, a record of the ownership, extent, and value of the lands of England. This work was England’s first statistical abstract.

Because of Henry VIII’s fear of the plague, England began to register its dead in 1532. About this same time, French law required the clergy to register baptisms, deaths and marriages. During an outbreak of the plague in the late 1500s, the English government started publishing weekly death statistics. This practice continued, and by 1632 these Bills of Mortality listed births and deaths by sex. In 1662, Captain John Graunt used thirty years of these Bills to make predictions about the number of persons who would die from various diseases and the proportion of male and female births that could be expected. Summarized in his work, Natural and Political Observations ... Made upon the Bills of Mortality, Graunt’s study was a pioneer effort in statistical analysis. For his achievement in using past records to predict future events, Graunt was made a member of the original Royal Society.

The history of the development of statistical theory and practice is a lengthy one. We have only begun to list the people who have made significant contributions to this field. Later we will encounter others whose names are now attached to specific laws and methods. Many people have brought to the study of statistics refinements or innovations that, taken together, form the theoretical basis of what we will study in this course.

1.5 Computer Software for Statistical Analysis

Many real problems have so much data that doing the calculations by hand is not feasible. For this reason, most real-world statistical analysis is done on computers. You must prepare the input data, interpret the results of the analysis, and take appropriate action, but the machine does all the “number crunching”. There are many widely used software packages for statistical analysis. Below we list some of them.

  • Minitab (registered trademark of Minitab, Inc., University Park, Pa.)
  • SAS (registered trademark of SAS Institute, Inc., Cary, N.C.)
  • SPSS (registered trademark of SPSS, Inc., Chicago)
  • SYSTAT (registered trademark of SYSTAT, Inc., Evanston, Ill.)
  • STATGRAPHICS (registered trademark of Statistical Graphics Corp., Maryland)

In addition to the software listed above, it is possible to carry out simple statistical analysis of data using the “Data Analysis” tool in Microsoft Excel.