DEMOGRAPHY
CHAPTER ONE
EVALUATION OF DEMOGRAPHIC DATA
1. Errors Affecting Quality of Data
The sources of demographic data include censuses, sample surveys, vital registration, population registers, etc.
The quality of data from these sources varies from time to time and from place to place.
1.1 There are, however, two types of errors:
i. Coverage errors
ii. Content errors
i) Coverage Errors: These are due to the omission or erroneous inclusion of some people; the net coverage error is omissions minus erroneous inclusions. In most cases this is due to political reasons. Under-enumeration: in this case, households, persons, villages and hamlets are left out. Coverage errors may arise from unstable populations such as nomads and people with dual residence; they may also arise from scattered populations, enumeration areas that are too large for the enumerators, inaccessibility of some areas, etc.
ii) Content Errors: These refer to instances where the characteristics of persons counted in a census or enumeration are incorrectly reported, tabulated or coded. The respondent is not the only source of error. Errors can arise at any stage of data management: the interviewers or enumerators and the coders can also contribute to them.
1.2 Age Errors
Age errors are the most significant in demographic data and they have been more intensively examined than any other reporting errors because:
- They are easy to identify
- Measurement techniques can be more easily developed
- Most classifications are based on age data
Age data can be refined for further analysis. In fact, demographers and actuaries have special means of identifying age errors and of refining the data for further analysis and for the computation of life tables.
Since our intention is to adjust age-based data, coverage errors and content errors will be examined with respect to information on age.
Errors in tabulated data on age include enumeration errors with respect to coverage and content.
- Coverage errors commonly found in censuses are of two types: individuals in a given age group may be missed out or erroneously included. Omission of individuals in an age group is called Gross Under-enumeration. The net balance of those missed out in an age group and those erroneously included is referred to as Net Under-enumeration, i.e. Net Under-enumeration = (Omissions minus those erroneously included).
- On content error, the ages of individuals included in the enumeration may have been erroneously reported by the respondent, erroneously estimated by the enumerator or erroneously allocated by the census office. The tabulated data from any enumeration would show that some persons in an age group were incorrectly reported out of that age group into higher or lower ages, while others with higher or lower ages were reported into the age group. From the reporting into and out of the age group, Gross Misreporting, also known as Response Variability of Age, is calculated. Such calculations are based on the reports of individuals. However, the reporting into each age group may offset the reporting out of it, hence Net Misreporting is determined. This is also known as Response Bias.
- Net Census Error: The combination of Net Under-enumeration (NU) and Net Misreporting (NM) is known as Net Undercount or Net Census Error: NCU = NU + NM
E.g. take, for instance, the group aged 35.
Net Under-enumeration: those aged 35 who were completely omitted from the enumeration, less those erroneously included as aged 35 who should not be part of the population.
Net Misreporting: the number reported as aged 35 is affected by three groups:
a. Those whose correct age is 35 and who were reported as 35.
b. Those whose ages are above or below 35 but who were reported as 35 years of age.
c. Those aged 35 who were erroneously reported out of age 35 into a higher or lower age.
‘b’ is partly offset by ‘c’, and the difference is the Net Misreporting error for age 35.
Where the population data are grouped, errors, particularly Net Misreporting errors, tend to reduce.
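The arithmetic behind these definitions can be sketched in a few lines of Python. All counts below are hypothetical and serve only to show how NU, NM and NCU combine. The function name and the sign convention (a positive value means the census count at that age is too low) are assumptions made for the illustration.

```python
def net_census_error(omitted, erroneously_included, reported_out, reported_in):
    """Return (NU, NM, NCU) for a single age.

    Assumed sign convention: positive values mean the census count
    at this age is lower than the true population.
    """
    # Net Under-enumeration: omissions minus erroneous inclusions
    nu = omitted - erroneously_included
    # Net Misreporting: group 'c' (reported out) offset by group 'b' (reported in)
    nm = reported_out - reported_in
    # Net Census Error: NCU = NU + NM
    return nu, nm, nu + nm


# Hypothetical figures for age 35
nu, nm, ncu = net_census_error(
    omitted=500, erroneously_included=120, reported_out=300, reported_in=450
)
print(nu, nm, ncu)
```

In this made-up case more people are reported into age 35 than out of it, so the misreporting term is negative and partly cancels the under-enumeration.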
1.3 Causes of Errors in Age Data
i. Ignorance of correct age, especially among illiterate populations
ii. Carelessness in reporting and in recording age
- There is a tendency to record or state ages ending in certain figures known as preferred digits, e.g. ages ending in 0 or 5.
- There is a possible subconscious aversion to certain numbers. Some people do not like certain numbers, e.g. the British do not like the figure 13. In this part of the world the figures 9 and 30 are also disliked.
- Exaggeration of length of life at advanced ages.
- Misplacement of age ……. from some motive, such as economic, social, political or purely personal.
It is, however, necessary to note that the deficiencies in tabulated data differ between single years of age and grouped data.
1.4 Errors in Single Years of Age
The principal error in single years of age is age heaping, also referred to as age preference or digit preference. This is the tendency to report certain ages at the expense of others. (Other types of errors may also affect single years of age, such as age misreporting, non-reporting of age or mis-assignment of age.) For instance, age 0 is usually under-enumerated at the expense of age 1.
1.5 Detection of Errors
Errors in single years of age can be detected through:
- Post-Enumeration Survey
- Record Matching
- The Use of Indexes
1.6 Indexes Based on Rectangularity
The index to be used depends on the assumptions made. The simplest device is the one based on a rectangular distribution, where equal numbers at each age are assumed over some age range that includes, and is preferably centred on, the age being examined. For example, the populations at ages 39, 40 and 41, denoted by P39, P40 and P41, may be used to examine preference for age 40.
If preference for age 40 in a certain year is to be determined on the assumption of a rectangular population distribution, it can be calculated as the ratio of the population aged 40 to one third of the population aged 39, 40 and 41, expressed as a percentage. This is for a 3-year age range.
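\[ \frac{P_{40}}{\tfrac{1}{3}\,(P_{39} + P_{40} + P_{41})} \times 100 \]
If a 5-year age range is used, it will be expressed thus:
\[ \frac{P_{40}}{\tfrac{1}{5}\,(P_{38} + P_{39} + P_{40} + P_{41} + P_{42})} \times 100 \]
e.g. Population of the Philippines in 1960, for the 3-year age range:
\[ \frac{434{,}156}{\tfrac{1}{3}\,(225{,}207 + 434{,}156 + 126{,}632)} \times 100 = \frac{434{,}156}{261{,}998.3} \times 100 = 165.71 \]
Whereas for the 5-year age range:
\[ \frac{434{,}156}{\tfrac{1}{5}\,(316{,}210 + 225{,}207 + 434{,}156 + 126{,}632 + 217{,}881)} \times 100 = \frac{434{,}156}{264{,}017.2} \times 100 = 164.44 \]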
The higher the index, the higher the concentration. An index of 100 indicates no concentration.
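The same calculation can be sketched in Python. The function name `rectangularity_index` and its parameters are illustrative assumptions. The five counts are the 1960 Philippine single-year figures for ages 38 to 42 from the table given later in this chapter.

```python
def rectangularity_index(pop, age, width):
    """Population at `age` as a percentage of the average population
    over a range of `width` single ages centred on `age`."""
    if width % 2 == 0:
        raise ValueError("width must be odd so that the range is centred on the age")
    half = width // 2
    ages = range(age - half, age + half + 1)
    average = sum(pop[a] for a in ages) / width
    return 100 * pop[age] / average


# Philippines 1960, ages 38-42 (single-year counts)
philippines_1960 = {38: 316_210, 39: 225_207, 40: 434_156, 41: 126_632, 42: 217_881}

index_3 = rectangularity_index(philippines_1960, age=40, width=3)
index_5 = rectangularity_index(philippines_1960, age=40, width=5)
print(round(index_3, 2), round(index_5, 2))
```

Both values are far above 100, which points to marked heaping on age 40 whichever range is chosen.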
1.7 Whipple's Index
This is another measure of age heaping. The index was developed to reflect preference for or avoidance of a particular terminal digit. For a large age range, the assumption about the true form of the distribution may be that of linearity, i.e. the true figures form an arithmetic progression, decreasing by equal amounts from age to age over the range.
Whipple's index is one of the indexes in which the assumption of rectangularity or linearity is made.
If rectangularity is assumed within each 10-year age range, heaping on terminal digit zero in the age range 23 to 62 years may be measured by comparing the sum of the populations at the ages ending in zero in this range with one tenth of the total population in the range.
For example, the population at ages ending in zero is P30 + P40 + P50 + P60, which is compared with one tenth of the sum P23 + P24 + …… + P62, and the result is multiplied by 100.
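\[ \frac{P_{30} + P_{40} + P_{50} + P_{60}}{\tfrac{1}{10}\,(P_{23} + P_{24} + P_{25} + \dots + P_{60} + P_{61} + P_{62})} \times 100 \]
For ages ending in 0, i.e. the ten-year age range:
\[ = \frac{\sum (P_{30} + P_{40} + P_{50} + P_{60})}{\tfrac{1}{10} \sum_{x=23}^{62} P_{x}} \times 100 \]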
In the same manner, employing the assumption of rectangularity or linearity in a 5-year age range, heaping on terminal digits 0 and 5 combined in the age range 23-62 may be measured by comparing the sum of the populations at ages in this range ending in zero and five with one fifth of the total population in the age range.
\[ \frac{P_{25} + P_{30} + P_{35} + \dots + P_{55} + P_{60}}{\tfrac{1}{5}\,(P_{23} + P_{24} + P_{25} + \dots + P_{60} + P_{61} + P_{62})} \times 100 \]
For ages ending in 0 and 5, i.e. the five-year age range:
\[ = \frac{\sum (P_{25} + P_{30} + \dots + P_{55} + P_{60})}{\tfrac{1}{5} \sum_{x=23}^{62} P_{x}} \times 100 \]
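The two forms of the index differ only in which terminal digits are counted and in the expected share (one tenth or one fifth). A minimal Python sketch, assuming the single-year counts are held in a dictionary keyed by age, could look as follows. The function name, its parameters and the test populations are hypothetical.

```python
def whipple_index(pop, digits=(0, 5), low=23, high=62):
    """Whipple-type index for the given terminal digits.

    pop    : dict mapping single year of age -> population
    digits : terminal digits tested for heaping, (0,) or (0, 5)
    The expected share under rectangularity is len(digits) / 10.
    """
    ages = range(low, high + 1)
    total = sum(pop[a] for a in ages)
    heaped = sum(pop[a] for a in ages if a % 10 in digits)
    # 100 * heaped / (len(digits)/10 * total), written without fractions
    return 100 * heaped * 10 / (len(digits) * total)


# Hypothetical populations: perfectly even reporting, and total heaping
even = {a: 100 for a in range(23, 63)}
all_heaped = {a: (100 if a % 5 == 0 else 0) for a in range(23, 63)}

print(whipple_index(even))             # no preference
print(whipple_index(all_heaped))       # every age reported ends in 0 or 5
print(whipple_index(even, digits=(0,)))
```

The even population gives 100 and the fully heaped one gives 500, the two extremes of the combined index for digits 0 and 5.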
ASSUMPTIONS
1. Assumption of rectangularity or, mainly, linearity in the age distribution
2. Assumption that age heaping occurs only at ages ending in ‘0’ and ‘5’
3. Assumption that the population is closed to migration or is equally affected by migration at all ages
4. Assumption that the population distribution is not affected by unusual occurrences
5. Assumption that the populations at childhood ages and at extreme old ages are affected by types of error other than preference for specific terminal digits
e.g.
Population of Turkey
Age / Male (‘000) / Female (‘000)
25 / 191 / 393
30 / 301 / 471
35 / 283 / 373
40 / 244 / 409
45 / 201 / 260
50 / 158 / 350
55 / 78 / 150
60 / 121 / 297
Population at ages ending in ‘0’ / 824 / 1,527
Population at ages ending in ‘5’ / 753 / 1,176
Population at ages ending in ‘0’ and ‘5’ / 1,577 / 2,703
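Whipple's Index
For males, ages ending in 0:
\[ \frac{301 + 244 + 158 + 121}{\tfrac{1}{10}\,(3{,}601)} \times 100 = \frac{824}{360.1} \times 100 = 228.8 \text{ or } 229 \]
For females, ages ending in 0:
\[ \frac{1{,}527}{\tfrac{1}{10}\,(3{,}950)} \times 100 = 386.58 \approx 387 \]
For males, ages ending in 5:
\[ \frac{753}{\tfrac{1}{10}\,(3{,}601)} \times 100 = 209 \]
For females, ages ending in 5:
\[ \frac{1{,}176}{\tfrac{1}{10}\,(3{,}950)} \times 100 = 297.72 \text{ or } 298 \]
Here 3,601 is the total male population (in thousands) aged 23-62. The female total of 3,950 is the figure implied by the two female results shown.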
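The four results can be checked with a short Python sketch. The sums at ages ending in 0 and 5 come from the Turkey table, and 3,601 is the male total for ages 23-62 used in the text. The female total of 3,950 is not stated in the text: it is an assumption, back-calculated from the reported female results (386.58 and 297.72). The names used are illustrative.

```python
# Populations in thousands, ages 23-62
totals = {"male": 3601, "female": 3950}   # female total is inferred, see note above
ending_0 = {"male": 824, "female": 1527}  # P30 + P40 + P50 + P60
ending_5 = {"male": 753, "female": 1176}  # P25 + P35 + P45 + P55


def single_digit_index(heaped, total):
    """Heaped population as a percentage of one tenth of the total."""
    return 100 * heaped / (total / 10)


indexes = {}
for sex in ("male", "female"):
    indexes[f"{sex}_0"] = single_digit_index(ending_0[sex], totals[sex])
    indexes[f"{sex}_5"] = single_digit_index(ending_5[sex], totals[sex])

for name, value in indexes.items():
    print(name, round(value, 1))
```

On these figures the female indexes exceed the male ones for both terminal digits.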
Females show a higher tendency than males to concentrate on ages ending in 0 and 5.
This may be due to:
1. The high illiteracy rate among females
2. Females being less interested in age, especially in the traditional set-up
3. Males in most cases reporting the ages of females on their behalf, with a tendency to use rounded figures for females
Note: Whipple's index measures heaping on ages ending in zero and five only.
The combined index for digits 0 and 5 varies from 100, showing no preference for ages ending in zero or five, to 500, showing that only ages ending in zero or five were reported.
LIMITATIONS
- Whipple's index is applicable only where age is reported in single years.
- The choice of the age range 23 to 62 is largely arbitrary. It is based on Whipple's cultural setting. The ages of childhood and extreme old age are excluded, since they are believed to be more affected by other types of errors than by preference for specific terminal digits.
- The assumption of rectangularity or linearity in the age distribution rarely holds in a real situation. The age distribution of a population is not rectangular.
- It does not take into consideration that the age distribution may be affected by unusual occurrences such as a baby boom, war, flood, disaster or famine.
- It does not take account of mortality.
- Some demographers are of the view that Whipple's index is not an efficient measure of age preference, since it measures heaping on ages ending in 0 and 5 only.
The method could, however, be extended to other digits, but Whipple's index measures only ages ending in ‘0’ or ‘5’.
ADVANTAGES
- It is very easy to calculate.
- It can easily be compared among the populations of various countries.
Scale of Reliability
Quality of the data / Whipple's Index
Highly accurate / Less than 105
Fairly accurate / 105-109.9
Approximate / 110-124.9
Rough / 125-174.9
Very rough / 175 and over
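The scale above translates directly into a lookup. The sketch below is only an illustration, with a hypothetical function name. It treats each band as starting at its lower bound, so that an index of exactly 105 falls in the "Fairly accurate" band.

```python
def whipple_quality(index):
    """Classify a Whipple's index value on the reliability scale."""
    if index < 105:
        return "Highly accurate"
    if index < 110:
        return "Fairly accurate"
    if index < 125:
        return "Approximate"
    if index < 175:
        return "Rough"
    return "Very rough"


for value in (102.3, 107.0, 118.5, 150.0, 229.0):
    print(value, whipple_quality(value))
```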
1.8 Myers Blended Index
Myers' index is an ingenious device which reflects the preference or dislike for each of the ten terminal digits 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9.
It is applicable where age is given in single years. The method is called "blended" because it was developed to avoid the bias in indexes calculated in the manner of Whipple's index, which arises because the numbers at ages ending in zero would normally be larger than the numbers at ages ending in 1 to 9.
The Underlying Assumptions
a. It assumes that ages are linearly related.
b. It assumes a likelihood of age heaping (or avoidance) in each age group.
c. It assumes that, in the absence of systematic irregularities in the reporting of age, the blended sum at each terminal digit should be approximately equal to 10% of the total blended population.
If the sum at any given digit exceeds 10% of the total blended population, it indicates over-selection of ages ending in that digit, i.e. digit preference.
On the other hand, a negative deviation, or a sum less than 10% of the blended total, indicates under-selection of ages ending in that digit, i.e. digit avoidance.
Unlike Whipple's index, which is restricted to ages 23-62, Myers' index starts at age 10 and ends at the highest age ending in 9 that the data cover. For example, for data covering ages 10-75 the method will cover 10-69; where the data run up to 90 years the method will cover 10-89.
Data Requirements
Population distribution by single year of age within age range 10-69 or 10-79 or in some cases 10-89 from census data.
Procedure for Calculating Myers Index
Step I: Sum the populations at ages ending in each digit over the whole range, starting with the lower limit of the range of population distributed in single years, up to 89 (i.e. 10-89):
Sum (10, 20, 30, …, 80)
Sum (11, 21, 31, …, 81)
…
Sum (19, 29, 39, 49, …, 89)
Step II: Sum the populations again, excluding the first age combined in Step I, e.g.
Sum (20, 30, 40, …, 80)
Sum (21, 31, 41, …, 81)
…
Sum (29, 39, 49, …, 89)
i.e. ages 10, 11, 12, 13, …, 19 are excluded.
Step III: Weight the sums in Steps I and II to obtain the blended population at each terminal digit, and add the results to obtain the grand blended population.
Weights (W) for the Step I and Step II sums respectively:
1 and 9 for the ‘0’ digit
2 and 8 for the ‘1’ digit
3 and 7 for the ‘2’ digit
…
10 and 0 for the ‘9’ digit
Step IV: Convert the distribution in Step III into percentages.
Step V: Take the deviation of each percentage in Step IV from 10.0. In the absence of digit preference, the deviation for each terminal digit is expected to be zero (0).
The result in Step V indicates the degree of preference for or avoidance of ages at each terminal digit.
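B_i = blended population at terminal digit i, where i ranges from 0 to 9
\[ \text{Grand blended population} = \sum_{j=0}^{9} B_j \]
\[ \text{Magnitude of preference for digit } i = \left| \frac{100\, B_i}{\sum_{j=0}^{9} B_j} - 10 \right| \]
A summary index of preference for all terminal digits, known as the Myers Blended Index, is taken as one half the sum of the deviations from 10%, each taken without regard to sign, i.e. one half the sum of the magnitudes of preference.
\[ \text{Myers Blended Index} = \frac{1}{2} \sum_{i=0}^{9} \left| \frac{100\, B_i}{\sum_{j=0}^{9} B_j} - 10 \right| \]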
The Myers Blended Index is an estimate of the minimum proportion of persons in the population for whom an age with an incorrect terminal digit is reported.
Theoretically, the index ranges between 0 and 90. Zero represents no heaping, while 90 represents the reporting of all ages at a single digit.
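Steps I to V can be sketched as a small Python function. This is an illustration under stated assumptions, not a reference implementation. The function name and parameters are hypothetical, the age range is fixed at 10-89 by default, and both sets of sums stop at the same upper age, as in the procedure described above.

```python
def myers_blended_index(pop, start=10, end=89):
    """Return (index, deviations) following Steps I-V.

    pop maps single year of age -> population; missing ages count as 0.
    deviations[d] is the percentage share of digit d minus 10.
    """
    blended = []
    for digit in range(10):
        # Step I: ages start+digit, start+digit+10, ... up to `end`
        step1 = sum(pop.get(a, 0) for a in range(start + digit, end + 1, 10))
        # Step II: the same sum without the first age
        step2 = sum(pop.get(a, 0) for a in range(start + 10 + digit, end + 1, 10))
        # Step III: weights (digit + 1) and (9 - digit)
        blended.append((digit + 1) * step1 + (9 - digit) * step2)
    grand_total = sum(blended)
    # Steps IV and V: percentage distribution and deviation from 10
    deviations = [100 * b / grand_total - 10 for b in blended]
    index = sum(abs(d) for d in deviations) / 2
    return index, deviations


# Hypothetical extreme case: every age is reported at a multiple of ten
only_zero = {age: 100 for age in range(10, 90, 10)}
index, deviations = myers_blended_index(only_zero)
print(index, deviations[0])
```

In this extreme case the whole blended population falls on digit 0, so its deviation is +90 and the index reaches its theoretical maximum of 90.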
Population of the Philippines by single years of age: 1960.
Age / Number / Age / Number / Age / Number
Total / 27,089,685 / 31 yrs / 222,086 / 63 yrs / 40,154
Under 1 year / 786,464 / 32 yrs / 318,481 / 64 yrs / 34,381
1 yr / 888,180 / 33 yrs / 246,260 / 65 yrs / 102,440
2 yrs / 963,230 / 34 yrs / 233,700 / 66 yrs / 26,445
3 yrs / 969,309 / 35 yrs / 401,936 / 67 yrs / 35,311
4 yrs / 965,232 / 36 yrs / 242,659 / 68 yrs / 40,711
5 yrs / 957,698 / 37 yrs / 242,462 / 69 yrs / 20,921
6 yrs / 928,673 / 38 yrs / 316,210 / 70 yrs / 136,771
7 yrs / 938,899 / 39 yrs / 225,207 / 71 yrs / 13,000
8 yrs / 841,636 / 40 yrs / 434,156 / 72 yrs / 28,017
9 yrs / 702,492 / 41 yrs / 126,632 / 73 yrs / 16,662
10 yrs / 841,356 / 42 yrs / 217,881 / 74 yrs / 14,490
11 yrs / 581,400 / 43 yrs / 169,167 / 75 yrs / 50,558
12 yrs / 796,786 / 44 yrs / 151,142 / 76 yrs / 15,010
13 yrs / 619,293 / 45 yrs / 319,118 / 77 yrs / 11,878
14 yrs / 596,592 / 46 yrs / 160,329 / 78 yrs / 23,353
15 yrs / 565,714 / 47 yrs / 160,855 / 79 yrs / 9,212
16 yrs / 566,942 / 48 yrs / 237,287 / 80 yrs / 73,741
17 yrs / 538,891 / 49 yrs / 155,094 / 81 yrs / 5,532
18 yrs / 651,318 / 50 yrs / 313,636 / 82 yrs / 9,331
19 yrs / 491,441 / 51 yrs / 78,534 / 83 yrs / 5,653
20 yrs / 565,801 / 52 yrs / 128,935 / 84 yrs / 5,089
21 yrs / 494,895 / 53 yrs / 93,279 / 85 yrs / 18,604
22 yrs / 515,823 / 54 yrs / 95,715 / 86 yrs / 4,803
23 yrs / 456,892 / 55 yrs / 163,093 / 87 yrs / 5,617
24 yrs / 425,212 / 56 yrs / 87,754 / 88 yrs / 4,388
25 yrs / 522,203 / 57 yrs / 71,828 / 89 yrs / 4,000
26 yrs / 358,549 / 58 yrs / 93,049 / 90 yrs / 57,111
27 yrs / 376,221 / 59 yrs / 72,206
28 yrs / 395,766 / 60 yrs / 275,436
29 yrs / 300,610 / 61 yrs / 31,299
30 yrs / 535,924 / 62 yrs / 49,634
Note: Whipple's index is computed over ages 23-62.
Calculation of preference indexes for terminal digits by the Myers Blended Index for the Philippines, 1960
Columns:
(1) Population with terminal digit ‘a’, starting at age 10 + ‘a’
(2) Population with terminal digit ‘a’, starting at age 20 + ‘a’
(3) Weight for col. 1
(4) Weight for col. 2
(5) Blended population: (1) x (3) + (2) x (4)
(6) Percent distribution of col. 5
Deviation: deviation of percent from 10.0, i.e. col. 6 - 10.0
Terminal digit ‘a’ / (1) / (2) / (3) / (4) / (5) / (6) / Deviation
0 / 3,176,821 / 2,335,465 / 1 / 9 / 24,196,006 / 16.06 / +6.06
1 / 1,553,378 / 971,978 / 2 / 8 / 10,882,580 / 7.22 / -2.78
2 / 2,064,888 / 1,268,102 / 3 / 7 / 15,071,378 / 10.00 / 0.00
3 / 1,647,360 / 1,028,067 / 4 / 6 / 12,757,842 / 8.47 / -1.53
4 / 1,556,321 / 959,729 / 5 / 5 / 12,580,250 / 8.35 / -1.65
5 / 2,143,666 / 1,577,952 / 6 / 4 / 19,173,804 / 12.72 / +2.72
6 / 1,462,491 / 895,549 / 7 / 3 / 12,924,084 / 8.58 / -1.42
7 / 1,443,063 / 904,172 / 8 / 2 / 13,352,848 / 8.86 / -1.14
8 / 1,762,082 / 1,110,764 / 9 / 1 / 16,969,502 / 11.26 / +1.26
9 / 1,278,691 / 787,250 / 10 / 0 / 12,786,910 / 8.49 / -1.51
TOTAL / / / / / 150,695,204 / 100.00 / 20.07 (sum without regard to sign)
Summary index of age preference = Total / 2 = 20.07 / 2 = 10.04
Note: A summary index of preference for all terminal digits can be taken as ½ the sum of the deviations from 10.0%, each taken without regard to sign; thus ½ (20.07) = 10.04.
For the population of the Philippines examined, the summary index of preference is 10.04. This indicates heaping, but the heaping is not serious; hence it could be said that age in this population is moderately well reported.
Very small deviations from 100, 10 or 0 shown by the various measures of heaping do not necessarily indicate heaping, and they should be disregarded.
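The worked table can be reproduced from its first two columns with a few lines of Python. The variable names are illustrative. One value is an assumption to note: the digit-2 entry starting at age 20 is taken as 1,268,102, the figure consistent both with the age-10 column less the population aged 12 and with the blended population of 15,071,378.

```python
# Philippines 1960: sums by terminal digit 0..9
start_10 = [3_176_821, 1_553_378, 2_064_888, 1_647_360, 1_556_321,
            2_143_666, 1_462_491, 1_443_063, 1_762_082, 1_278_691]
start_20 = [2_335_465, 971_978, 1_268_102, 1_028_067, 959_729,
            1_577_952, 895_549, 904_172, 1_110_764, 787_250]

# Blended population: weights (digit + 1) and (9 - digit)
blended = [(d + 1) * s10 + (9 - d) * s20
           for d, (s10, s20) in enumerate(zip(start_10, start_20))]
grand_total = sum(blended)

# Percent distribution and deviation from 10
percent = [100 * b / grand_total for b in blended]
deviation = [p - 10 for p in percent]
summary_index = sum(abs(x) for x in deviation) / 2

for d in range(10):
    print(d, blended[d], round(percent[d], 2), round(deviation[d], 2))
print(grand_total, round(summary_index, 2))
```

The largest positive deviations fall on digits 0, 5 and 8, and the summary index agrees with the 10.04 obtained in the table.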
Limitations
- The assumption that preferences for all digits are the same is not always true. The ‘actual’ or ‘true’ population in any single year of age does not exactly equal 1/5 or 1/10 of the five- and ten-year age groups respectively.
- The assumption that the age distribution of the population is linear may not hold in reality. The age distribution of a population may have irregular fluctuations, depending largely on the past trend of births and deaths. Digit preference cannot be measured precisely, as distinctions between errors due to digit preference, errors due to other factors and real fluctuations can hardly be made.
- The method, as may be observed, does not cover the entire age range.
- Besides, the method can be applied only where ages are presented in single years.
- It does not identify compensating errors.
- Myers' index can produce different results, because the age at which the calculation starts or ends may vary.
- The method does not take care of coverage error but centres only on content error, i.e. error of misreporting.
Usefulness
- The above notwithstanding, Myers' index is one of the most widely applied methods of measuring age preference at each digit.
- It assists in measuring the accuracy of data.
- It also facilitates comparison among different populations, especially when the same age range is used in the calculation.
- It is also a very good method of measuring differentials in digit preference by sex.
- The method helps in selecting five-year age groupings that can be used to minimize errors.
1.9 Horizontal Method of Analysis
Sex Ratio
The sex ratio can be calculated for single years of age or for grouped data. The sex ratio is the number of males per hundred females.
\[ \text{Sex Ratio} = \frac{M}{F} \times 100 \]
1991 Population Census, Ekiti State:
\[ \text{Sex Ratio} = \frac{759{,}986}{775{,}804} \times 100 = 97.96 \]
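Sex Ratio at Birth
This is expressed as the number of male births per hundred female births.
\[ \text{Sex Ratio at Birth} = \frac{P_{BM}}{P_{BF}} \times 100 \]
Age-Specific Sex Ratio (ASSR)
The ASSR is the number of males per hundred females in a particular age group.
\[ \text{ASSR} = \frac{{}_{5}P^{m}_{x}}{{}_{5}P^{f}_{x}} \times 100, \quad \text{where } x = 10 \]
\[ = \frac{P^{m}_{10-14}}{P^{f}_{10-14}} \times 100 = \frac{5{,}812{,}538}{5{,}336{,}143} \times 100 = 108.93 \]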
The ASSR can be used in evaluating census age data. The general pattern of age-specific sex ratios is that they approximate the sex ratio at birth at the younger ages and then decline gradually towards the advanced ages.
Where populations have similar sex ratios at birth, their ASSRs are usually similar, provided migration is not a factor. It should, however, be noted that mortality affects the sex ratio at older ages: more women than men survive, especially after the childbearing years.
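The sex ratio and the age-specific sex ratio are the same calculation applied to different counts. A minimal Python sketch, using the Ekiti State totals and the age 10-14 figures quoted in this section (the function name is an illustrative choice):

```python
def sex_ratio(males, females):
    """Number of males per hundred females."""
    return 100 * males / females


# 1991 census, Ekiti State (all ages)
ekiti = sex_ratio(759_986, 775_804)

# Age-specific sex ratio for the age group 10-14
assr_10_14 = sex_ratio(5_812_538, 5_336_143)

print(round(ekiti, 2), round(assr_10_14, 2))
```

With these inputs the two ratios come to about 97.96 and 108.93 respectively.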
1.10 Vertical Analysis
Age Ratio (AR)
This can be used in evaluating age data. The age ratio is usually defined as the ratio of the population in a given age group to one half of the population in the two adjacent age groups, expressed per hundred, particularly when the UN method is used.
\[ AR = \frac{{}_{5}P^{f}_{x}}{\tfrac{1}{2}\left({}_{5}P^{f}_{x-5} + {}_{5}P^{f}_{x+5}\right)} \times 100 \]
For x = 10:
\[ AR = \frac{P^{f}_{10-14}}{\tfrac{1}{2}\left(P^{f}_{5-9} + P^{f}_{15-19}\right)} \times 100 \]
5Px = the population in the five-year age group from age x to x + 4, i.e. the age group being examined
5Px-5 = the age group preceding the age group being examined
5Px+5 = the age group following the age group being examined
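A minimal Python sketch of the age ratio follows. The function name and the counts are hypothetical, chosen only to show the mechanics. The first and last age groups have no ratio because they lack one of the two neighbours.

```python
def age_ratios(groups):
    """Age ratios (per hundred) for the interior five-year age groups.

    groups: populations of consecutive five-year age groups, in age order.
    Returns one ratio for each group that has both neighbours.
    """
    ratios = []
    for i in range(1, len(groups) - 1):
        # One half of the two adjacent age groups
        neighbours_mean = (groups[i - 1] + groups[i + 1]) / 2
        ratios.append(100 * groups[i] / neighbours_mean)
    return ratios


# Hypothetical female populations for ages 5-9, 10-14, 15-19, 20-24
females = [1000, 900, 800, 950]
print([round(r, 1) for r in age_ratios(females)])
```

Here the group aged 10-14 sits exactly midway between its neighbours and gets a ratio of 100, while the group aged 15-19 falls below 100.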
The computed age ratios are then compared with an expected value, which is usually one hundred e.g.
Age 0 - 4=J