The Indonesian Sub-National Growth and Governance Dataset

14 March 2011
Acknowledgements

This dataset was constructed by a research team led by Dr. Neil McCulloch of the Institute of Development Studies, UK, under an AusAID research project on “Measuring the Economic Benefit of Better Local Economic Governance in Indonesia” No. ABN 62 921 558 838. The research team included Pak Agung Pambudhi and his staff at KPPOD, Ms. Sukma Yuningsih at the World Bank, Jakarta, and Dr. Eddy Malesky at the University of San Diego.

A large part of the data compilation and documentation was done by Ms. Sukma Yuningsih. Pak Boedi Rheza of KPPOD also helped to integrate KPPOD’s data and prepare the final dataset. We are grateful to BPS for permission to use the underlying data drawn from a variety of BPS surveys, as well as to the Asia Foundation for the use of the Economic Governance Index dataset.

Disclaimer

This dataset is distributed as a resource for researchers. We do not guarantee the accuracy of the data and accept no responsibility for it. Any questions regarding the data should be directed to the organisations responsible for the production of the original data from which this dataset is constructed. We are unable to offer support, assistance or updates of any kind.

INTRODUCTION

It is widely believed that good local economic governance is important for boosting local economic performance. A research project, funded by AusAID and led by Dr. Neil McCulloch at the Institute of Development Studies in the UK, set out to test whether this is indeed the case (see details of the project and the final report). It did so by compiling a unique dataset which draws together data on the economic characteristics and performance of Indonesia’s districts (Kabupaten/Kota) between 2001 and 2007, along with data from a 2007 survey by KPPOD/Asia Foundation which measured the quality of economic governance at the district level. This document gives background information to assist researchers in using the dataset for their own research. The dataset is in Stata 11 format.

TYPES OF VARIABLES AND DATA SOURCES

The bulk of the variables in the dataset are drawn from a range of standard surveys undertaken by the Badan Pusat Statistik (BPS). However, it is important to note that the number of districts is not consistent across the original sources of data. For example, the regional GDP (GRDP) publication dataset, BPS’s Susenas household survey and the Village Potential (Podes) surveys do not always have the same number of districts even when they are conducted in the same year. This is because different sampling frames are used at different times of the year. In addition, the de jure and de facto status of a new district is recorded differently by different institutions.

To be consistent, we have used the definition of the Ministry of Finance: an autonomous province/district is one that receives DAU (Dana Alokasi Umum, the general allocation grant) at the beginning of the fiscal year. Between 2001 and 2009, the number of districts in Indonesia (excluding six non-autonomous district-level governments in Jakarta) was: 2001 = 336, 2002 = 348, 2003 = 370, 2004 = 410, 2005-07 = 434, 2008 = 451, and 2009 = 477.

In this dataset, we use 2001 as our reference point, with 342 districts (336 districts and six non-autonomous district level governments in Jakarta) to avoid spurious changes resulting from the splitting of districts. That is, if districts subsequently split after 2001, we aggregated the data from the child districts so that our dataset shows a consistent series of variables for the geographical regions that comprised the districts in 2001. Table 1 shows the identification variables in our dataset.

Identification data

Table 1: Identification variables

id_m / WB coding for Kabupaten/Kota
kp09 / coding for Province (2009: 33 provinces)
kkk09 / coding for Kabupaten/Kota (2009)
name09 / Name of regions (2009)
province09 / Name of province (2009)
island / Name of Island
dummy_kota / Dummy for Kota (1 = Kota, 0 = Kabupaten)
jawa / Dummy for Java (1 = Java, 0 = off-Java)
EASTINDO / Dummy for Eastern Indonesia (= 1)
Dsumat / Dummy Sumatra Island
Djawa / Dummy Java Island
Dkalim / Dummy Kalimantan Island
Dsulw / Dummy Sulawesi Island
Dnusa / Dummy Nusa Tenggara-Maluku Island
Dpapua / Dummy Papua Island
parent_336 / WB code (base 2001, districts collapsed into 336 districts)
name_336 / Names of the 336 parent regions
split_342 / Dummy for the 342 districts since 2001 (1 = split into new regions, 0 = never split)
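
The aggregation of child districts into their 2001 parent regions, described before Table 1, can be sketched as follows. This is only an illustration under stated assumptions: the district codes and values are invented, and the sketch assumes that a level variable (such as GRDP or population) is simply summed over all districts sharing the same parent_336 code. Rates and shares would need to be recomputed from aggregated numerators and denominators rather than summed.

```python
from collections import defaultdict

# Hypothetical records: (year, district code, parent_336 code, nominal GRDP).
# Codes and values are invented for illustration only.
records = [
    (2001, "A", "A", 100.0),
    (2005, "A", "A", 80.0),   # district A after it split
    (2005, "A1", "A", 45.0),  # child district created after 2001
    (2005, "B", "B", 60.0),   # district that never split
]


def aggregate_to_parent(rows):
    """Sum a level variable over all districts sharing a 2001 parent code."""
    totals = defaultdict(float)
    for year, _district, parent, value in rows:
        totals[(year, parent)] += value
    return dict(totals)


parent_totals = aggregate_to_parent(records)
print(parent_totals[(2005, "A")])  # value for the 2001 boundaries of region A
```

With this layout, every year carries one observation per 2001 region, which is what makes the series comparable over time.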

National Income data

The Gross Regional Domestic Product (Produk Domestik Regional Bruto, PDRB) is the market value of all final goods and services produced within a region during a given period of time.[1] The value of intermediate goods is not counted separately because the value of the final goods already contains it. The GRDP can be used as a measure of economic activity.[2] The data are provided by the Central Bureau of Statistics (Badan Pusat Statistik, BPS) on a yearly basis. The GRDP data used in this paper are taken from GRDP by production sector (2000–2007), which were kindly provided by BPS on request.

Based on the prices used, GRDP is classified into:

  1. Nominal GRDP: production is valued as the quantity produced in a specific year at the current prices of the end products.
  2. Real GRDP: production is valued as the quantity produced in a specific year at the constant prices of the base year (2000). This calculation shows real changes in production regardless of variations in end-product prices.

Based on the sector’s contribution, GRDP is classified into:

  1. GRDP with oil and gas (PDRB Migas): the aggregate of all sectors within a specific year.
  2. GRDP without oil and gas (PDRB Non Migas): the aggregate of all sectors excluding the Oil and Gas Mining and Oil and Gas Manufacturing subsectors.

The data are at the Kabupaten/Kota level (PDRB Kabupaten/Kota) and are available for both nominal and real GRDP.

In addition, the GRDP is broken down into sectoral groupings. Table 2 shows how the three-digit sectoral codes in the raw data have been converted into the single-digit sectoral classifications in the dataset.

Table 2: Conversion from 3-digit classification of GRDP to GRDP Items in the dataset

GRDP items, 3-digit (2000-2007) / GRDP items in the dataset (2000-2007)
Item / Sector / Item / Sector
100 / Agriculture Total / 1 / Agriculture Total
200 / Mining and Quarrying Total / 2 / Mining and Quarrying, Oil and Gas Manufacturing Total
210 / Mining (Oil) / Mining (Oil)
220 / Mining (Others) / Mining (Others)
230 / Quarrying / Quarrying
300 / Manufacturing Total / Manufacturing Oil and Gas
310 / Manufacturing Oil and Gas
320 / Manufacturing Non-Oil and Gas / 3 / Manufacturing Non-Oil and Gas
400 / Electricity, Gas & Water Supply Total / 4 / Electricity, Gas & Water Supply Total
500 / Construction Total / 5 / Construction Total
600 / Trade, Restaurant & Hotel Total / 6 / Trade, Restaurant & Hotel Total
700 / Transport and Communication Total / 7 / Transport and Communication Total
800 / Financial Services / 8 / Financial Services
900 / Public Administration & Services / 9 / Public Administration & Services
998 / Without Oil and Gas / Without Oil and Gas
999 / Gross Domestic Product / Gross Domestic Product
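
The conversion in Table 2 can be sketched as a simple lookup. This is a reading of the table rather than the authors' code: it assumes that items 210, 220, 230 and 310 all feed dataset sector 2, that item 320 is dataset sector 3, and that the subtotal and total rows (200, 300, 998, 999) are skipped to avoid double counting. The input values are invented.

```python
# Mapping from 3-digit raw GRDP items to 1-digit dataset sectors,
# as read from Table 2 (an assumption, not taken from the original code).
ITEM_TO_SECTOR = {
    100: 1,
    210: 2, 220: 2, 230: 2, 310: 2,  # mining, quarrying, oil & gas manufacturing
    320: 3,                          # non-oil & gas manufacturing
    400: 4, 500: 5, 600: 6, 700: 7, 800: 8, 900: 9,
}


def to_dataset_sectors(raw_items):
    """Aggregate 3-digit GRDP items into the 1-digit dataset sectors."""
    sectors = {}
    for item, value in raw_items.items():
        sector = ITEM_TO_SECTOR.get(item)
        if sector is None:
            # Subtotals (200, 300) and totals (998, 999) are not mapped.
            continue
        sectors[sector] = sectors.get(sector, 0.0) + value
    return sectors


# Invented values for one district-year.
raw = {100: 50.0, 200: 30.0, 210: 10.0, 220: 15.0, 230: 5.0,
       300: 40.0, 310: 8.0, 320: 32.0, 400: 3.0}
dataset_sectors = to_dataset_sectors(raw)
print(dataset_sectors)
```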

In addition, there are dummy variables indicating whether the district has oil and gas or not (migas*, MIGAS*, D1 and D2) and whether this is the main sector or not.

Table 3 provides a list of the key national income variables in the dataset.

cy / Real Income (GRDP) BY=2000
RGDPnoil_ / Real Income (GRDP) without oil & gas BY=2000
y / Nominal Income (GRDP)
GDPnoil / Nominal Income (GRDP) without oil & gas
agr_ / Agriculture, GRDP
min_ / Mining, Quarrying, Oil & Gas Manufacturing, GRDP
man_ / Non Oil & Gas Manufacturing, GRDP
enr_ / Electricity, Gas & Water Supply, GRDP
con_ / Construction, GRDP
trd_ / Trade, Restaurant & Hotel, GRDP
trs_ / Transportation and Communication, GRDP
fin_ / Financial Services, GRDP
ser_ / Services, GRDP
Shagr / Share of agriculture in total GRDP
Shmin / Share of mining in total GRDP
Shman / Share of non-oil & gas manufacturing in total GRDP
Shenr / Share of electricity in total GRDP
Shcon / Share of construction in total GRDP
Shtrd / Share of trade in total GRDP
Shtrs / Share of transportation in total GRDP
Shfin / Share of financial services in total GRDP
Shser / Share of services in total GRDP

Population data

There are in fact three different sources of population data:

  1. Interpolations from the Population Census;
  2. Susenas data; and
  3. The population measures used by the Ministry of Finance to calculate fiscal transfers.

In our analysis we use the first source because these are the official population figures published by the BPS. However, the Susenas population figures are also provided in the dataset.

Economic Performance Variables

We calculate and include a range of measures of economic performance and growth over the period of the dataset.

To calculate per capita growth, we have used the Gross Regional Domestic Product (GRDP) divided by the population data from the BPS. GRDP per worker is calculated using the estimate of the labour force from Susenas.[3] The measure of growth used in the analysis is the geometric growth rate over the period (i.e. (final value/initial value)^(1/number of periods) - 1). However, linear growth rates (i.e. year-on-year) are also calculated, as are logarithmic growth rates (i.e. [ln(final value) - ln(initial value)]/number of periods) and the average annual growth rate (i.e. the mean of the annual growth rates).
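
The growth measures named above can be sketched as follows. This is an illustration, not the code used to build the dataset: it assumes the standard compound definition of the geometric rate, (final/initial)^(1/n) - 1, and the function names and sample series are hypothetical.

```python
import math


def linear_growth(previous, current):
    """Year-on-year growth rate between two consecutive values."""
    return (current - previous) / previous


def geometric_growth(initial, final, periods):
    """Compound average growth rate over `periods` intervals."""
    return (final / initial) ** (1.0 / periods) - 1.0


def log_growth(initial, final, periods):
    """Average log difference per period."""
    return (math.log(final) - math.log(initial)) / periods


def average_annual_growth(series):
    """Mean of the year-on-year growth rates of a series."""
    rates = [linear_growth(a, b) for a, b in zip(series, series[1:])]
    return sum(rates) / len(rates)


# Invented per capita GRDP series for three consecutive years.
pcy = [100.0, 110.0, 121.0]
print(geometric_growth(pcy[0], pcy[-1], len(pcy) - 1))
print(log_growth(pcy[0], pcy[-1], len(pcy) - 1))
print(average_annual_growth(pcy))
```

For a series growing at a constant rate the geometric and average annual rates coincide, while the logarithmic rate is slightly lower; the measures diverge more when annual growth is volatile.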

In addition, we have calculated the weighted (by GRDP) and unweighted average growth rates of the districts surrounding each district, to allow the exploration of spillovers between districts (see GAU and GAW variables).
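
A minimal sketch of the neighbour averages follows. The adjacency list, district codes and values are hypothetical, and the sketch assumes that the weighted average uses each neighbour's GRDP as its weight; the document does not state which year's GRDP is used.

```python
def neighbour_growth(growth, grdp, neighbours):
    """Return {district: (weighted, unweighted)} average growth of neighbours.

    growth and grdp map district -> value; neighbours maps district -> list
    of adjacent districts. Districts without neighbours get (None, None).
    """
    result = {}
    for district, adjacent in neighbours.items():
        if not adjacent:
            result[district] = (None, None)
            continue
        unweighted = sum(growth[n] for n in adjacent) / len(adjacent)
        total_grdp = sum(grdp[n] for n in adjacent)
        weighted = sum(growth[n] * grdp[n] for n in adjacent) / total_grdp
        result[district] = (weighted, unweighted)
    return result


# Invented example with three districts; C is an island with no neighbours.
growth = {"A": 0.02, "B": 0.04, "C": 0.08}
grdp = {"A": 100.0, "B": 300.0, "C": 100.0}
neighbours = {"A": ["B", "C"], "B": ["A"], "C": []}
averages = neighbour_growth(growth, grdp, neighbours)
print(averages["A"])
```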

See the section below on Consumption Expenditure for details of per capita consumption growth rates.

To get a sense of economic concentration (in the sense of whether the local economy is dominated by a particular sector), we also calculate the sectoral Gini (gini_sector). The Stata command for this variable is:

egen gini_sector = inequal(sh_sector), by(year parent_336) index(gini)

or

egen gini_sector = gini(sh_sector), by(year parent_336)
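
For readers without Stata, the idea behind the sectoral Gini can be sketched in Python. This is an assumption-laden illustration: it uses the common rank-based formula without a small-sample correction, so values may differ slightly from those produced by the Stata commands above if they normalise differently.

```python
def gini(values):
    """Gini coefficient of non-negative values (no small-sample correction)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = 2 * sum(rank * x) / (n * sum(x)) - (n + 1) / n, ranks from 1 upwards
    ranked_sum = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * ranked_sum / (n * total) - (n + 1.0) / n


# Sector shares of GRDP for one district-year (nine sectors).
equal_shares = [1.0 / 9] * 9          # perfectly diversified economy
single_sector = [0.0] * 8 + [1.0]     # economy dominated by one sector
print(gini(equal_shares), gini(single_sector))
```

A value near zero indicates an economy spread evenly across sectors, while a value near the upper bound indicates dominance by one sector.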

Table 4 shows a list of the key per capita and per worker economic performance variables.

PCY_ / Income per-total Population
PCYnoil_ / Income Without Oil & Gas per-total Population
PLY_ / Income per-total Workers
lnPLY_ / Ln per worker Real GDP
lnPCY_ / Ln per capita Real GDP
lnPCYnoil_ / Ln per capita Real GDP Without Oil & Gas
y0706 / Linear Growth of Percap_RGDP 2006-2007
y0605 / Linear Growth of Percap_RGDP 2005-2006
y0504 / Linear Growth of Percap_RGDP 2004-2005
y0403 / Linear Growth of Percap_RGDP 2003-2004
y0302 / Linear Growth of Percap_RGDP 2002-2003
y0201 / Linear Growth of Percap_RGDP 2001-2002
y_noil0706 / Linear Growth of Percap_RGDP Without Oil & Gas 2006-2007
y_noil0605 / Linear Growth of Percap_RGDP Without Oil & Gas 2005-2006
y_noil0504 / Linear Growth of Percap_RGDP Without Oil & Gas 2004-2005
y_noil0403 / Linear Growth of Percap_RGDP Without Oil & Gas 2003-2004
y_noil0302 / Linear Growth of Percap_RGDP Without Oil & Gas 2002-2003
y_noil0201 / Linear Growth of Percap_RGDP Without Oil & Gas 2001-2002
yl0706 / Linear Growth of Income per Labor 2006-2007
yl0605 / Linear Growth of Income per Labor 2005-2006
yl0504 / Linear Growth of Income per Labor 2004-2005
yl0403 / Linear Growth of Income per Labor 2003-2004
yl0302 / Linear Growth of Income per Labor 2002-2003
yl0201 / Linear Growth of Income per Labor 2001-2002
gy0701 / Geometric Average Growth 2001-2007; post-decentralization income percapita
gy0501 / Geometric Average Growth 2001-2005; post-decentralization income percapita
gy_noil0701 / Geometric Average Growth Without Oil & Gas 2001-2007; post-decentralization income percapita
gyl0701 / Geometric Average Growth 2001-2007; post-decentralization income perlabor
lny0701 / Logarithmic Growth 2001-2007; post-decentralization income percapita
avy0701 / Average Growth 2001-2007; post-decentralization income percapita
ary0701 / Arithmetic Growth 2001-2007; post-decentralization income percapita
GAW0107 / Weighted Average Growth of neighbouring districts 01-07
GAW0105 / Weighted Average Growth of neighbouring districts 01-05
GAUn0107 / Unweighted Average Growth of neighbouring districts 01-07
GAUn0105 / Unweighted Average Growth of neighbouring districts 01-05
gini_sector / Gini coefficient of the sectoral structure of the economy, by GRDP

Socio-economic variables

The dataset contains a large number of socio-economic variables drawn from the Susenas Core datasets from 2001 to 2007.

Education outcome indicators

-Net Enrolment Rate

The net enrolment rate is the number of pupils enrolled in a given level of education (primary/junior secondary/senior high secondary) who belong to the theoretical school-age group for that level, divided by the population of the same age group.

The school-age groups are: primary (7-12 years old), junior secondary (13-15 years old), and senior high secondary (16-18 years old).

-Gross Enrolment Rate

The gross enrolment rate is the number of students enrolled in a given level of education (primary/junior secondary/senior high secondary), regardless of age, divided by the population of the theoretical school-age group for that level.
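
The difference between the two rates can be illustrated with a short sketch. The pupil ages and population count are invented, the function name is hypothetical, and survey weights are ignored for simplicity.

```python
def enrolment_rates(enrolled_ages, school_age_population, age_range):
    """Return (net, gross) enrolment rates for one level of education.

    enrolled_ages: ages of everyone enrolled at this level.
    school_age_population: number of people in the theoretical age group.
    age_range: (lowest, highest) age of the theoretical age group.
    """
    low, high = age_range
    enrolled_in_age_group = sum(1 for age in enrolled_ages if low <= age <= high)
    net = enrolled_in_age_group / school_age_population
    gross = len(enrolled_ages) / school_age_population
    return net, gross


# Ten primary school pupils, three of them outside the 7-12 age group,
# in a district with eight children aged 7-12.
primary_pupils = [6, 7, 8, 9, 10, 11, 12, 12, 13, 14]
ner, ger = enrolment_rates(primary_pupils, 8, (7, 12))
print(ner, ger)
```

Because under-age and over-age pupils count towards the gross rate but not the net rate, the gross rate can exceed 1 while the net rate cannot.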

Labour indicators

Since the Labour Force Survey, Sakernas, is designed to be representative only at national and provincial levels, it cannot be used to obtain district-level averages. Therefore, we used the Susenas data to provide indicators of the labour force, employment and unemployment at the district level. (Unfortunately, the Susenas 2005 individual data from BPS contain no information on labour issues, so labour data are missing for 2005.) Table 5 shows the questions (asked since 2001) that determine the classification of an individual as participating in the labour force, and as employed or unemployed. Figure 1 shows how these questions determine the classification.

Table 5: Questions that determine the classification into employed/unemployed

1. Did you work last week?
2. Did you work at least 1 hour last week?
3. Do you have a job/business but are currently not active in it?
4. Are you looking for a job?
5. Are you preparing a new business?
6. What are your reasons for not looking for a job/preparing a new business?

Figure 1: Classification of Employed/Unemployed

Definition of labour force:

Labour Force: Persons aged 15 years and over who were working, were temporarily absent from work but had jobs, or did not have work and were looking for work:

  1. Working: An activity done by a person who worked for pay or assisted others in obtaining pay or profit for at least one hour during the survey week.
  2. Temporarily absent from work, but having jobs: A person who had a job but was temporarily absent from work for some reason during the survey week.
  3. Did not have work and looking for work: All persons who did not have any job but were looking for work during the survey week. This is usually called open unemployment.
  4. Preparing for work
  5. The reason for not looking for a job/preparing a new business:

-It’s impossible to get a job

-Already have a job, but have not started work yet
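
One possible reading of the questions in Table 5 and the definitions above is sketched here. Since Figure 1 is not reproduced in this text, the decision order is an assumption, and the function name and reason labels are hypothetical.

```python
# Hypothetical labels for the two reasons (question 6) listed above.
UNEMPLOYED_REASONS = {"impossible to get a job", "has job, not started yet"}


def classify(age, worked, worked_one_hour, has_job_inactive,
             looking, preparing, reason=None):
    """Return 'employed', 'unemployed' or 'not in labour force'."""
    if age < 15:
        # The labour force definition covers persons aged 15 and over.
        return "not in labour force"
    if worked or worked_one_hour or has_job_inactive:
        return "employed"
    if looking or preparing or reason in UNEMPLOYED_REASONS:
        return "unemployed"
    return "not in labour force"


print(classify(30, False, False, True, False, False))   # temporarily absent
print(classify(22, False, False, False, True, False))   # job seeker
print(classify(40, False, False, False, False, False, "housekeeping"))
```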

Field sector of work classification

Starting from 2001, BPS renewed its field of work classification system, moving from a simple 9-sector classification to a 3-digit KLUI system. To avoid possible errors in the coding of sub-sectors, we only use the first-digit sectoral breakdown, as in the old coding system.

  1. Agriculture sector
  2. Mining and excavation
  3. Manufacturing industry
  4. Electricity, gas, and water
  5. Building construction
  6. Accommodation services
  7. Transportation, storing, and communication
  8. Financial institution, real estate, and leasing
  9. Public, social, personal services
  10. Activity that does not have clear limitation rule

Because the raw Susenas data has information on labour issues for all individuals aged 10 or over, we have calculated the number of people working in each sector aged 10 or over, rather than 15 or over.

We have also calculated the proportion of people living in urban areas using Susenas data.

In addition, we include a measure of the concentration of the labour force (complementing the sectoral concentration of GDP above). These are the gini_secsus* variables, which give the Gini coefficient of the sectoral shares of employment (as opposed to GDP) for each district.

Consumption expenditure

To get an estimate of overall welfare, we use the per capita expenditure data from Susenas i.e. household expenditure divided by household size. Household expenditure is divided into food and non-food consumption expenditure.

The key variables are created in the following steps:

-Per capita consumption expenditure is created from the Susenas Core (household data). Average per capita expenditure per month is average household expenditure per month divided by household size; average annual per capita expenditure is average household expenditure per month multiplied by 12 and then divided by household size.

-The individual weights from the Susenas Core (individual data) are kept and merged with the household data.

-The data are then collapsed to give average per capita expenditure per month by district code (b1r1 b1r2) using the individual weights.
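
The steps above can be sketched as a weighted average by district. This is an illustration only: the records are invented, the district code is hypothetical, and the sketch assumes that each household record carries a single weight after the merge, which the text does not spell out.

```python
from collections import defaultdict


def district_pcexp(households):
    """Weighted mean per capita expenditure by district.

    households: iterable of (district, monthly household expenditure,
    household size, weight). Returns {district: {"pcexp": monthly value,
    "ypcexp": annual value}}.
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for district, hh_exp_month, hh_size, weight in households:
        per_capita_month = hh_exp_month / hh_size
        weighted_sum[district] += weight * per_capita_month
        weight_total[district] += weight
    result = {}
    for district, total in weighted_sum.items():
        monthly = total / weight_total[district]
        result[district] = {"pcexp": monthly, "ypcexp": monthly * 12}
    return result


# Two invented households in one hypothetical district.
sample = [("3201", 400.0, 4, 10.0), ("3201", 600.0, 2, 30.0)]
pcexp_by_district = district_pcexp(sample)
print(pcexp_by_district["3201"])
```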

Ethnic and Religious Fragmentation Indices

The dataset includes indices of ethnic and religious fragmentation, similar to those calculated by Easterly and Levine (1997)[4]. They are based on the 2000 Population Census from the Indonesian Bureau of Statistics (BPS).

Ethnolinguistic Fractionalization (ELF):

The ELF index can be defined as follows:

$$ELF_j = 1 - \sum_{i} s_{ij}^2$$

where $s_{ij}$ is the population share of ethnic group $i$ in district $j$. The index takes values between zero and one, where a value close to 1 implies a highly heterogeneous district and a value of 0 refers to a perfectly homogeneous district. In the 2000 Census, the Indonesian Central Bureau of Statistics recorded 1,068 ethnic groups across regions in Indonesia.

Religion Fractionalization (RF):

In the 2000 Census, only five religions are officially categorized, namely Islam, Catholicism, Protestantism, Hinduism and Buddhism, together with an "Other" category.

$$RF_j = 1 - \sum_{i} r_{ij}^2$$

where $r_{ij}$ is the population share of religious group $i$ in district $j$. The index takes values between zero and one, where a value close to 1 implies a highly heterogeneous district and a value of 0 refers to a perfectly homogeneous district.
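
Both indices can be computed with the same small function. This is a sketch assuming the standard definition of fractionalization as one minus the sum of squared group shares; the group counts are invented.

```python
def fractionalization(group_counts):
    """One minus the sum of squared population shares of the groups."""
    total = sum(group_counts)
    if total <= 0:
        raise ValueError("group counts must sum to a positive number")
    return 1.0 - sum((count / total) ** 2 for count in group_counts)


# Invented district populations by group.
homogeneous = [100]              # a single group
two_equal = [50, 50]             # two groups of equal size
four_equal = [25, 25, 25, 25]    # four groups of equal size
print(fractionalization(homogeneous),
      fractionalization(two_equal),
      fractionalization(four_equal))
```

With equally sized groups the index equals 1 - 1/N, so it approaches 1 only as the number of groups becomes large.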

Table 6 shows the key variables drawn from the Susenas surveys.

POPUR / Population in urban areas (Susenas)
popsus / Population (Susenas)
popeduc / Population aged >= 5 years (Susenas)
age_prim / People of primary school age, 7-12 years (Susenas)
age_secj / People of junior school age, 13-15 years (Susenas)
age_sech / People of senior school age, 16-18 years (Susenas)
Enrolp / People aged 7-12 enrolled in primary school (Susenas)
Enroly / People aged 13-15 enrolled in junior school (Susenas)
enrolls / People aged 16-18 enrolled in senior school (Susenas)
scl_sd / People who have attended or are attending primary school (Susenas)
scl_smp / People who have attended or are attending junior school (Susenas)
scl_sma / People who have attended or are attending senior school (Susenas)
none_scl / People who have never attended school
Scl / People who have attended or are attending school
prim / People enrolled in primary school (Susenas)
secj / People enrolled in junior school (Susenas)
sech / People enrolled in senior high school (Susenas)
NER_sd / Net enrolment rate, primary school (Susenas)
NER_smp / Net enrolment rate, junior school (Susenas)
NER_sma / Net enrolment rate, senior school (Susenas)
enPRIM / Gross enrolment rate, primary school (Susenas)
enSECJ / Gross enrolment rate, junior school (Susenas)
enSECH / Gross enrolment rate, senior school (Susenas)
enSECT / Gross enrolment rate, junior + senior school (Susenas)
PRIM / Share of people who have attended or are attending primary school in total population (Susenas)
SECJ / Share of people who have attended or are attending junior secondary school in total population (Susenas)
SECH / Share of people who have attended or are attending high secondary school in total population (Susenas)
lf_ / Labour force (age >= 15 years) (Susenas)
employ / Employed population (Susenas)
unempl / Unemployed population (Susenas)
POPEMP / Number of people working (age >= 10 years) (Susenas)
AGR / Number of people working in Agriculture (Susenas)
MIN / Number of people working in Mining (Susenas)
MAN / Number of people working in Manufacturing (Susenas)
ENR / Number of people working in Energy & Electricity (Susenas)
CON / Number of people working in Construction (Susenas)
TRD / Number of people working in Trade (Susenas)
TRS / Number of people working in Transportation (Susenas)
FIN / Number of people working in Finance (Susenas)
SER / Number of people working in Services (Susenas)
OTHER_SECTOR / Number of people working in Other Sector (Susenas)
SH_AGR / Share of Agriculture workers in total population (Susenas)
SH_MIN / Share of Mining workers in total population (Susenas)
SH_MAN / Share of Manufacturing workers in total population (Susenas)
SH_ENR / Share of Energy & Electricity workers in total population (Susenas)
SH_CON / Share of Construction workers in total population (Susenas)
SH_TRD / Share of Trade workers in total population (Susenas)
SH_TRS / Share of Transport workers in total population (Susenas)
SH_FIN / Share of Finance workers in total population (Susenas)
SH_SER / Share of Services workers in total population (Susenas)
SH_OTHER_SECTOR / Share of Other Sector workers in total population (Susenas)
sh_urban / Urbanization (share of population that is urban) (Susenas)
pcexp / Average monthly per capita consumption (Susenas)
ypcexp / Average annual per capita consumption (Susenas)
Lnypexp2001 / Ln of average annual per capita consumption 2001 (Susenas)
ypcexp0706 / Linear Growth of Annual Per Capita Expenditure 2006-2007
ypcexp0605 / Linear Growth of Annual Per Capita Expenditure 2005-2006
ypcexp0504 / Linear Growth of Annual Per Capita Expenditure 2004-2005
ypcexp0403 / Linear Growth of Annual Per Capita Expenditure 2003-2004
ypcexp0302 / Linear Growth of Annual Per Capita Expenditure 2002-2003
ypcexp0201 / Linear Growth of Annual Per Capita Expenditure 2001-2002
GAUnypcexp0107 / Unweighted Average Growth of Per Capita Expenditure of Neighbouring Districts 01-07
GAUnypcexp0105 / Unweighted Average Growth of Per Capita Expenditure of Neighbouring Districts 01-05
gini_sectsus / Gini coefficient of workers (>= 10 years old) by sector (Susenas)

Consumer Price Indices and Real Per Capita Expenditure