AN ANALYSIS OF THE IMPACT OF POLICY AND OTHER FACTORS ON CARBON MONOXIDE EMISSIONS

Growing environmental concern throughout the world has ignited a number of controversial views on how environmental clean-up should be achieved. In the context of developing countries in particular, several options have been put forth. Environmentalists have held that the general levels of production and consumption are the prime drivers of environmental pollution and that only by reducing economic activity can environmental performance be improved. Other opinions have favoured a reduction in population among developing countries as a key step to resolving environmental degradation, especially in their larger cities. Yet another perspective has emphasised that, more than population and consumption, the answer lies in formulating and implementing appropriate policies.

To understand the relative significance of each of these opinions, it would be most useful to look directly at a cross-section of developing countries to see how their environmental performance has been linked to the above factors. However, a paucity of consistent data makes this exercise difficult. A possible alternative is to conduct the relevant data analysis using information on the US, and to extrapolate, to the extent possible, the broad implications of the results for developing countries. Data relating to environmental emissions should ideally comprise information on Nitrogen Oxides, Carbon Monoxide, Volatile Organic Compounds and Sulfur Dioxide. Using these, a global air quality variable should be obtained. However, in light of the mathematical complications this would produce, it is simpler to look at one emission variable.


Response Variable: I will choose Carbon Monoxide (CO) per capita as the response variable, since this gas is closely connected to car-related pollution, a source of growing concern among developing countries as their number of cars increases. The data on CO emissions are available at www.epa.gov. I have divided the emissions in each year by that year's population, so as to remove the part of the time trend in emissions that simply reflects population growth. I will be using data from 1960 to 1989, as the two policies we will be examining were enacted during this period. The units of measurement are millions of metric tonnes.
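The per capita adjustment described above is a year-by-year division. A minimal sketch in Python follows. The figures are made-up placeholders, not the EPA or population data, and the names `total_co`, `population` and `per_capita` are hypothetical.

```python
# Placeholder values only; these are NOT the EPA emission or population figures.
total_co = {1960: 100.0, 1961: 102.0, 1962: 105.0}    # total CO emissions by year
population = {1960: 50.0, 1961: 51.0, 1962: 52.5}     # population in the same years


def per_capita(totals, people):
    """Divide each year's total by that year's population."""
    return {year: totals[year] / people[year] for year in sorted(totals)}


copc = per_capita(total_co, population)
for year, value in copc.items():
    print(year, round(value, 3))
```

In this toy series total emissions rise every year, yet the per capita figure stays flat, which is the kind of population effect the division is meant to strip out.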

The plot of CO per capita versus time shows that until about 1970 the emission levels per capita were quite level, after which they began to fall consistently. This downward trend continued through to 1989. The downturn in emissions per capita could be related to the Clean Air Act of 1970. Interestingly, although the Clean Air Act of 1963 was in place earlier, the plot suggests that this policy did not have the same dramatic effect on CO per capita levels.


Among the predictor variables, I chose GNP per capita. I would expect this general indicator of the level of economic activity in the country to have some correlation with the level of CO per capita emissions. The scatter plot of CO per capita against GNP per capita shows a pattern similar to that of the time series plot of the response variable: the relationship is flat initially and then becomes quite obviously negative after about 1972.


I also chose gasoline consumption per capita (GCPC) over time, since the consumption of gasoline is an important indicator of vehicle-related pollution. To the extent that some gasoline may be used for purposes other than consumption in vehicles, my result may be biased. However, from what I know, gasoline is used almost entirely for vehicles, so there should not be any significant issue with respect to this variable.

Once again, we see a similar pattern in the plot of CO per capita (COPC) against gasoline consumption per capita, namely that the relationship remains quite level up to a point, after which it becomes quite negative. The effect here is a little delayed, since the downturn begins only after 1973. This could perhaps be the result of the policy put into place in 1970: after a lag of a few years, adjustments in the industry may have resulted in higher gasoline consumption per capita being associated with lower levels of CO per capita.

Another predictor variable I chose was the extent of forest fires in the country. This variable is important because forest fires are known to release substantial amounts of CO into the air. The data for this predictor variable were obtained from www.usda.gov and are measured in millions of acres of land destroyed by forest fires.


Here we do not see a distinct pattern emerge and it is difficult to guess whether our model will require this variable. However, it may be good to use it initially and see the results it gives.



Along with the above variables, I also chose to look at the Clean Air Acts (CAA) of 1963 and 1970 to see whether these policies had any dramatic impact on CO per capita emissions.

I would expect both these variables to have an impact on the level of CO per capita emissions, although looking at the scatter plots, I would expect the CAA of 1970 to be of greater value in our model.
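How the two policies enter the regression is not spelled out here. One common choice, which I assume purely for illustration, is an indicator that is 0 before the act and 1 from the year of the act onwards. The function name `policy_indicator` is hypothetical.

```python
def policy_indicator(years, start_year):
    """Return 1 for years at or after start_year, else 0 (assumed coding)."""
    return [1 if year >= start_year else 0 for year in years]


years = list(range(1960, 1990))          # 1960 to 1989, the period stated in the text
pol63 = policy_indicator(years, 1963)    # assumed coding for the CAA of 1963
pol70 = policy_indicator(years, 1970)    # assumed coding for the CAA of 1970

print(pol63[:5])   # first five years, 1960 to 1964
print(pol70[8:12]) # 1968 to 1971
```

Under this coding the coefficient on each indicator measures the shift in the level of CO per capita once the act is in force, holding the other predictors fixed.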

In addition to the above variables, I will include time as a variable. As the scatter plots above show, the relationship between each predictor variable and the response variable appears to change over time, so time seems to play an important role.



In running the regression with CO per capita as my response variable and GNP per capita, gasoline consumption per capita, area under forest fire, the policies of 1963 and 1970, and time as my predictor variables, I get the following results:


The errors appear normally distributed. However, the plot of standardized residuals over time indicates that autocorrelation might be a problem. I will not correct for it here, but will be aware that the results may have an element of bias in them.
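The fits in this paper were produced with MINITAB. For readers who want to see what an ordinary least squares fit involves, here is a minimal Python sketch that solves the normal equations directly. It is an illustration only: the function names are hypothetical, the data are made up, and with predictors as collinear as ours a production routine would use a more numerically stable method.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so each row operation acts on both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols_coefficients(predictors, response):
    """Return [intercept, b1, b2, ...] minimising the sum of squared errors."""
    rows = [[1.0] + list(p) for p in predictors]   # leading 1.0 is the intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, response)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up data generated exactly from y = 2 + 3*x1 - x2.
toy_x = [(1, 0), (0, 1), (1, 1), (2, 1), (3, 5)]
toy_y = [2 + 3 * x1 - x2 for x1, x2 in toy_x]
coefficients = ols_coefficients(toy_x, toy_y)
print([round(c, 6) for c in coefficients])
```

Because the toy response is an exact linear function of the predictors, the fit should recover the intercept and both slopes up to rounding error.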

Let us then look at our regression results (MINITAB output).

The regression equation is

COPC = 24443 + 0.0197 GNP P.C + 40.7 Pol63 + 50.9 Pol70 + 3.86 For.fires + 78.4 GCPC - 12.3 Year

Predictor        Coef      StDev       T       P     VIF
Constant        24443       9185    2.66   0.014
GNP P.C       0.01970    0.03133    0.63   0.535   288.9
Pol63           40.72      11.44    3.56   0.002     2.6
Pol70           50.90      12.69    4.01   0.001     4.2
For.fire        3.859      2.737    1.41   0.171     1.4
GCPC            78.44      81.98    0.96   0.348    15.7
Year          -12.256      4.758   -2.58   0.017   206.7

S = 16.48    R-Sq = 95.3%    R-Sq(adj) = 94.2%

Analysis of Variance

Source            DF       SS      MS       F       P
Regression         6   132965   22161   81.57   0.000
Residual Error    24     6521     272
Total             30   139486

Source      DF   Seq SS
GNP P.C      1   108698
Pol63        1      842
Pol70        1    15545
For.fire     1        5
GCPC         1     6073
Year         1     1802

Unusual Observations

Obs   GNP P.C     COPC      Fit   StDev Fit   Residual   St Resid
  4      6378   608.57   637.23       10.16     -28.66     -2.21R
 31     11272   386.29   382.17       14.21       4.12      0.49 X

R denotes an observation with a large standardized residual
X denotes an observation whose X value gives it large influence.

Durbin-Watson statistic = 1.16
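The Durbin-Watson statistic at the end of the output is the usual numerical check on the autocorrelation concern raised earlier. It is the sum of squared successive differences of the residuals divided by the sum of squared residuals. Values near 2 suggest no first-order autocorrelation, while values well below 2, such as the 1.16 reported here, point towards positive autocorrelation. A minimal sketch, using made-up residuals since the actual residuals are not listed:

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Made-up residual series, for illustration only.
alternating = [1.0, -1.0, 1.0, -1.0]   # sign flips every period (negative correlation)
persistent = [1.0, 1.0, -1.0, -1.0]    # runs of the same sign (positive correlation)

print(durbin_watson(alternating))
print(durbin_watson(persistent))
```

The alternating series gives a value above 2 and the persistent series a value below 2, which matches the usual reading of the statistic.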

The first thing that strikes me is the high VIF statistics, indicating a major multicollinearity problem. It is quite likely that GNP per capita, gasoline consumption per capita and time are strongly correlated with one another. This could explain the low t-statistics for some of the coefficient estimates, even though the F-statistic is a high 81.57, indicating that we can strongly reject the hypothesis that this model has no explanatory power. The R-squared is 95.3 % (94.2 % adjusted), indicating that about 95 % of the variability in the response variable can be explained by our model.
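The F-statistic and the two R-squared figures quoted here can be recovered from the sums of squares in the analysis of variance table. The sketch below uses the rounded values printed in that table, so small differences from MINITAB's own figures are to be expected. The function name `anova_summary` is hypothetical.

```python
def anova_summary(ss_regression, ss_error, df_regression, df_error):
    """Recover R-squared, adjusted R-squared and F from an ANOVA table."""
    ss_total = ss_regression + ss_error
    df_total = df_regression + df_error
    ms_regression = ss_regression / df_regression
    ms_error = ss_error / df_error
    return {
        "r_squared": ss_regression / ss_total,
        # The adjusted version compares mean squares rather than sums of squares.
        "adj_r_squared": 1.0 - ms_error / (ss_total / df_total),
        "f_statistic": ms_regression / ms_error,
    }


# Sums of squares and degrees of freedom copied from the ANOVA table above.
full_model = anova_summary(132965, 6521, 6, 24)
for name, value in full_model.items():
    print(name, round(value, 4))
```

The adjusted figure is lower than the plain R-squared because it penalises the six predictors for the degrees of freedom they use up.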

Another thing which stands out is that both the policy variables enter the model with positive coefficients, suggesting that the policies are associated with higher CO per capita emissions. This is completely counterintuitive and can perhaps be explained by the fact that after about 1970, when the second CAA came into effect, the relationships between several of the explanatory variables and CO per capita changed. Since the slope of the relationship changed after this period, it may be more accurate to split the data explicitly into two and run separate regressions for the period before 1970 and the period after.
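Splitting the sample is a simple filter on the year. In the sketch below the first period runs from 1960 to 1970 inclusive, which matches the eleven observations used in the first sub-period regression. That the second period starts in 1971 is my assumption for illustration. The rows are placeholders that carry only the year.

```python
# Placeholder rows; real rows would also carry COPC, GCPC and the other variables.
rows = [{"year": year} for year in range(1960, 1990)]

CUTOFF = 1970  # last year of the first period (assumed to belong to "before")

before = [row for row in rows if row["year"] <= CUTOFF]
after = [row for row in rows if row["year"] > CUTOFF]

print(len(before), before[0]["year"], before[-1]["year"])
print(len(after), after[0]["year"], after[-1]["year"])
```

Each sub-sample would then be passed to its own regression, so that the slopes are free to differ between the two periods.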



For the period before the second CAA (i.e. between 1960 and 1970), let us look to see whether our assumptions are violated. The errors do not seem very normally distributed, but are not very non-normal either. Some outliers do seem to be present and we will come back to this. The residuals show some cyclical pattern, although this does not seem to be a major problem. No major increases or decreases in variance are visible.



Let us then see how our regression results look.

The regression equation is

COPC = 14592 + 0.0077 GNP P.C - 0.024 For.fires + 259 GCPC - 7.25 Year + 11.5 Pol63

Predictor        Coef      StDev       T       P     VIF
Constant        14592       3937    3.71   0.014
GNP P.C       0.00772    0.01115    0.69   0.520   168.0
For.fire      -0.0235     0.8061   -0.03   0.978     2.9
GCPC           258.98      65.04    3.98   0.011   105.4
Year           -7.254      2.034   -3.57   0.016   108.6
Pol63          11.515      3.420    3.37   0.020     7.1

S = 2.047    R-Sq = 98.2%    R-Sq(adj) = 96.4%

Analysis of Variance

Source            DF        SS       MS       F       P
Regression         5   1144.08   228.82   54.60   0.000
Residual Error     5     20.95     4.19
Total             10   1165.04

Source      DF    Seq SS
GNP P.C      1   1001.07
For.fire     1     45.22
GCPC         1     19.03
Year         1     31.27
Pol63        1     47.50

Unusual Observations

Obs   GNP P.C      COPC       Fit   StDev Fit   Residual   St Resid
 11      8134   627.796   625.946       1.842      1.850      2.07R

R denotes an observation with a large standardized residual

Durbin-Watson statistic = 2.35

The regression seems to show a reasonably good fit. The adjusted R-squared indicates that about 96 % of the variability in CO per capita emissions can be explained by our model. We should be aware that some of the fit may be the result of an autocorrelation problem, which may be inflating it somewhat. The F-statistic indicates that we can strongly reject the null that the model has no explanatory power. GNP per capita and forest fires, however, have p-values high enough that we cannot reject the null that these variables have no explanatory power on their own.
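The T column in the output is simply each coefficient divided by its standard deviation, and the verdict on each variable can be checked against the critical value of Student's t. The sketch below uses the coefficients and standard deviations from the output above. The constant 2.571 is the two-sided 5 % critical value for 5 residual degrees of freedom, taken from standard t tables rather than from the MINITAB output.

```python
# (coefficient, standard deviation) pairs copied from the regression output above.
estimates = {
    "GNP P.C": (0.00772, 0.01115),
    "For.fire": (-0.0235, 0.8061),
    "GCPC": (258.98, 65.04),
    "Year": (-7.254, 2.034),
    "Pol63": (11.515, 3.420),
}

# Two-sided 5% critical value of Student's t with 5 degrees of freedom (from tables).
T_CRITICAL = 2.571

t_ratios = {name: coef / sd for name, (coef, sd) in estimates.items()}
not_significant = [name for name, t in t_ratios.items() if abs(t) < T_CRITICAL]

for name, t in t_ratios.items():
    print(f"{name:9s} t = {t:6.2f}")
print("Not significant at 5%:", not_significant)
```

Only the variables whose t-ratio falls inside the critical band are flagged, which agrees with the p-values reported by MINITAB.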

In carrying out the regression diagnostics, no major outliers or leverage points were apparent.

Also, multicollinearity seems to be a problem with the model. Some of this must continue to be due to the strong relationship between GNP per capita and gasoline consumption per capita.




Clearly, the above plots indicate a very strong relationship between GNP per capita and time, between gasoline consumption per capita and time, and between GNP per capita and gasoline consumption per capita. It would seem that the overall consumption level in the economy, as embodied by the GNP per capita data, may be less useful than the gasoline consumption data, which is likely to relate more directly to CO emissions. Therefore, GNP per capita can be dropped. Since the time trend seems to be an important issue in this model, let us keep it. We therefore run a regression with CO per capita as the response and area under forest fire, the CAA of 1963, per capita gasoline consumption and time as predictor variables. Before discussing the results, let us see whether our errors are more or less in keeping with our assumptions.





The errors look reasonably well distributed. There does not seem to be a problem of heteroskedasticity. The plot of standardized residuals versus time does not indicate a serious autocorrelation problem either.

Let us now analyse the results and see whether this is an improvement on the previous one.

Regression Analysis

The regression equation is

COPC = 13020 + 12.8 Pol63 + 288 GCPC - 6.44 Year - 0.392 For.fires

Predictor        Coef      StDev       T       P     VIF
Constant        13020       3073    4.24   0.005
Pol63          12.790      2.753    4.65   0.004     5.0
GCPC           287.64      47.92    6.00   0.001    62.7
Year           -6.438      1.584   -4.06   0.007    72.1
For.fire      -0.3916     0.5787   -0.68   0.524     1.7

S = 1.956    R-Sq = 98.0%    R-Sq(adj) = 96.7%

Analysis of Variance

Source            DF        SS       MS       F       P
Regression         4   1142.08   285.52   74.62   0.000
Residual Error     6     22.96     3.83
Total             10   1165.04

Source      DF   Seq SS
Pol63        1   192.48
GCPC         1   876.22
Year         1    71.62
For.fire     1     1.75

Durbin-Watson statistic = 2.22

As the results indicate, the multicollinearity problem is reduced now, but definitely not eliminated. Clearly, the relationship between time and gasoline consumption per capita has to do with this. However, since we are not breaking any assumptions of the model, we will stick with both the variables since they both seem to provide us with valuable information.
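To see how much the multicollinearity has eased, it helps to translate each VIF back into the R-squared obtained when that predictor is regressed on the other predictors, using VIF = 1 / (1 - R-squared). The sketch below applies this to the VIF values from the two pre-1970 regressions, with and without GNP per capita. The function name `implied_r_squared` is hypothetical.

```python
def implied_r_squared(vif):
    """R-squared of a predictor on the other predictors, from VIF = 1 / (1 - R^2)."""
    return 1.0 - 1.0 / vif


# VIF values copied from the two pre-1970 regression outputs.
vif_with_gnp = {"GCPC": 105.4, "Year": 108.6, "Pol63": 7.1, "For.fire": 2.9}
vif_without_gnp = {"GCPC": 62.7, "Year": 72.1, "Pol63": 5.0, "For.fire": 1.7}

for name in vif_without_gnp:
    before = implied_r_squared(vif_with_gnp[name])
    after = implied_r_squared(vif_without_gnp[name])
    print(f"{name:9s} R^2 with GNP {before:.3f}   without GNP {after:.3f}")
```

Even after dropping GNP per capita, gasoline consumption per capita and time are each still explained almost entirely by the remaining predictors, which is why the problem is reduced but not eliminated.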

The model contained two values which were influential points: one was the year 1960 and the other, the year 1970. I ran a regression without these points and did not see a dramatic change in my results. Also, because I have so few points in my data set, I do not wish to lose further information and will keep the data from these two years intact.

The adjusted R-squared of close to 97 % indicates that a very high proportion of the variability in CO per capita emissions can be explained by our model. The F-statistic indicates that we can strongly reject the null hypothesis that the model has no predictive power. In addition, the low p-values of all the explanatory variables except the forest fire variable indicate that, individually, they do contribute to explaining variability in CO per capita emissions. As we had seen in the scatter plot of CO per capita and forest fire, no apparent relationship seemed to exist between the two variables. This is now borne out by the high p-value of 0.524. It may be worthwhile to remove this variable, as it seems to be adding noise to our model. Hence we run a regression with CO per capita as the response variable and gasoline consumption per capita, time and the CAA of 1963 as predictors.