FREC 834: Determinants of Development
Fertility, Age Structure, Literacy, Income Inequality and GDP

The CIA maintains a database called the World Factbook at https://www.cia.gov/cia/publications/factbook/index.html that contains up-to-date cross-sectional data for more than 220 nations (including some you've never heard of), covering their geography, demographics, politics, economics, infrastructure and military.  If you access the website and click on the Guide to Country Profiles link, the listed field names link to a field definitions page, which in turn links to the data. I imported a series of these data fields into an Excel spreadsheet, omitting some nations with missing data.  The first worksheet tab contains GDP, demographics and literacy data for 219 nations; the second contains Gini coefficients representing the degree of income inequality for 121 nations.  Download the spreadsheet and use Excel's regression utility to perform the following analyses:

(For each of your regression analyses, you can copy and paste the appropriate data columns into a separate worksheet tab, then delete any rows with missing data.)
  1. How does income (per-capita GDP) affect fertility rates?
    A population's fertility rate is the average number of live births per female over her lifetime.  Create an XY-plot of Fertility rates on the Y-axis against Incomes (GDP/capita) on the X-axis.  Include a trendline with the regression equation and R-square in your plot. Then regress Fertility (Y range) against Income (X range) to analyze the statistical significance of the regression coefficients.  
    Since the trend in the plotted datapoints is obviously curved rather than linear, you can obtain a better linear regression model by transforming the data. Calculate the natural logarithms of fertility and income, then create an XY-plot of ln(F) against ln(I), including the regression trendline, equation and R-square. Then regress ln(F) vs. ln(I). Compare this log-log model against the original model: which fits the data better?
    The negative coefficient for Income implies that children are inferior goods: poor nations have significantly higher birthrates than rich nations. Explain why children are inferior goods.  
    Calculate the income elasticity of demand for children from the log-log model.
    At what level of per-capita GDP does the log-log model predict a zero-population-growth fertility rate of 2.10?
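
    The log-log steps above can also be sketched outside Excel. The Python below is a minimal illustration, not the assignment's required method: the function names fit_loglog and income_for_fertility and the six data points are hypothetical, and the numbers are NOT taken from the World Factbook spreadsheet. In a model ln(F) = a + b*ln(I), the slope b is the elasticity of F with respect to I, and the income at a target fertility follows from solving the fitted equation for I.

```python
import numpy as np


def fit_loglog(income, fertility):
    """Fit ln(F) = a + b*ln(I) by least squares; return (a, b, r_square)."""
    ln_i = np.log(np.asarray(income, dtype=float))
    ln_f = np.log(np.asarray(fertility, dtype=float))
    # For a degree-1 fit, polyfit returns the slope first, then the intercept.
    b, a = np.polyfit(ln_i, ln_f, 1)
    resid = ln_f - (a + b * ln_i)
    r_square = 1.0 - np.sum(resid**2) / np.sum((ln_f - ln_f.mean()) ** 2)
    return a, b, r_square


def income_for_fertility(a, b, target_fertility):
    """Solve ln(target) = a + b*ln(I) for I."""
    return np.exp((np.log(target_fertility) - a) / b)


# Hypothetical (income, fertility) pairs for illustration only.
income = [500.0, 1500.0, 4000.0, 10000.0, 25000.0, 45000.0]
fertility = [6.1, 4.8, 3.2, 2.4, 1.8, 1.6]

a, b, r_square = fit_loglog(income, fertility)
print(f"intercept a = {a:.3f}, elasticity b = {b:.3f}, R-square = {r_square:.3f}")
print(f"income at F = 2.10: {income_for_fertility(a, b, 2.10):,.0f}")
```

    With the real spreadsheet columns substituted for the made-up lists, the slope and R-square should match the trendline equation Excel displays on the ln(F) versus ln(I) plot.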

  2. How does age structure affect per-capita GDP?
    Per-capita GDP depends on the percentage of working-age people in the population. The "youth effect" hypothesis states that countries with large proportions of children (14 and younger) are likely to have lower per-capita GDP.  Likewise, the "elderly effect" hypothesis states that countries with larger proportions of elderly (65 and older, presumably non-working) people may also have lower per-capita GDP. 
    Create an XY-plot of percent <15, percent 15-64 and percent 65+ (Y-axis) versus the natural log of per-capita GDP (X-axis).
    Regress the natural logarithm of per-capita GDP against both percent <15 and percent 65+ to test the youth and elderly effects.  (If you regress against all three percent Age variables, the regression will fail because of perfect collinearity: the three percentages sum to 100, so percent 15-64 is exactly 100 minus the sum of percent <15 and percent 65+.) 
    Does your regression model support the youth effect hypothesis?
    Does it support the elderly effect hypothesis?
    Why might the elderly effect be insignificant in this model?
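
    The reason the three-variable regression fails can be checked numerically. The sketch below uses made-up age shares for five imaginary countries (not Factbook data) and assumes the regression includes an intercept, as Excel's does by default. Because the three shares sum to 100, the intercept column is an exact linear combination of them, so the design matrix loses a rank and no unique set of coefficients exists.

```python
import numpy as np

# Hypothetical age shares (percent) for five made-up countries.
pct_young = np.array([42.0, 35.0, 27.0, 19.0, 15.0])
pct_old = np.array([3.0, 5.0, 8.0, 14.0, 20.0])
pct_working = 100.0 - pct_young - pct_old  # shares sum to 100 by construction

ones = np.ones_like(pct_young)  # intercept column

# Design matrix with all three age shares plus the intercept (4 columns).
X_all = np.column_stack([ones, pct_young, pct_working, pct_old])
# Design matrix with the working-age share dropped (3 columns).
X_two = np.column_stack([ones, pct_young, pct_old])

rank_all = np.linalg.matrix_rank(X_all)
rank_two = np.linalg.matrix_rank(X_two)

print(f"all three shares: {X_all.shape[1]} columns, rank {rank_all}")
print(f"two shares:       {X_two.shape[1]} columns, rank {rank_two}")
```

    A rank below the number of columns means one column is redundant; dropping percent 15-64 restores full rank, which is why the assignment uses only the two dependent-age shares.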

  3. How do income inequality (Gini coefficient), literacy rate and fertility rate affect per-capita GDP?
    The Gini coefficient is a measure of income inequality, calculated from the Lorenz curve, which plots the cumulative share of income against the cumulative share of the population ranked from poorest to richest.  It is the area between the Lorenz curve and the 45-degree line representing a perfectly equal income distribution, divided by the total area under the 45-degree line.   A nation with a low Gini coefficient (<0.3) will typically have a large middle class and relatively few very poor or very rich people.  A nation with a very high Gini coefficient (>0.6) will typically have extensive poverty, little or no middle class, and a small economic elite.   US income inequality has increased:  the Census Bureau has reported rising Gini coefficients of 0.394 in 1970, 0.403 in 1980, 0.428 in 1990, 0.462 in 2000 and 0.469 in 2005.
    Use the data in the second worksheet tab (121 nations for which Gini coefficients were calculated) to regress per-capita GDP against the Gini coefficient, overall literacy rate and fertility rate. 
    Use Excel's Data-Analysis Correlation tool to calculate the correlations between Gini coefficient, overall literacy rate and fertility rate.
    Explain the statistical significance of the regression model. 
    Explain the correlations among the three right-hand-side variables. Do they make sense?
    Explain the economic development policy implications of this regression model.
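
    The area-ratio definition of the Gini coefficient given above can be made concrete with a short calculation. This is a minimal sketch on a list of individual incomes, using a trapezoid approximation to the area under the Lorenz curve; the function name gini and the sample incomes are hypothetical, and published Gini coefficients are computed from survey data with methods that may differ.

```python
import numpy as np


def gini(incomes):
    """Gini coefficient from the area between the Lorenz curve and the 45-degree line."""
    x = np.sort(np.asarray(incomes, dtype=float))  # rank from poorest to richest
    n = x.size
    # Lorenz curve: cumulative income share at each cumulative population share.
    lorenz = np.concatenate([[0.0], np.cumsum(x) / x.sum()])
    pop = np.linspace(0.0, 1.0, n + 1)
    # Trapezoid rule for the area under the Lorenz curve.
    area_under_lorenz = np.sum((lorenz[1:] + lorenz[:-1]) / 2.0 * np.diff(pop))
    # The area under the 45-degree line on the unit square is 0.5.
    return (0.5 - area_under_lorenz) / 0.5


# Hypothetical incomes for illustration only.
equal_society = [30000, 30000, 30000, 30000]
unequal_society = [2000, 3000, 5000, 110000]

print(f"equal incomes:   Gini = {gini(equal_society):.3f}")
print(f"unequal incomes: Gini = {gini(unequal_society):.3f}")
```

    When everyone has the same income the Lorenz curve coincides with the 45-degree line and the coefficient is zero; the more income is concentrated at the top, the further the curve sags and the closer the coefficient moves toward one.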

  4. How do per-capita GDP and literacy affect longevity?
    Regress expected longevity against the natural logarithm of per-capita GDP and total literacy percentage.  (Be aware that "literacy" is not consistently defined; some countries claim their adult populations are 100% literate, although this is implausible.)
    Why would literacy affect longevity, even after accounting for differences in economic wealth?
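
    For readers who want to cross-check Excel's multiple-regression output, the coefficients, standard errors and t-statistics it reports can be reproduced with ordinary least squares. The sketch below is one way to do that, not the assignment's prescribed tool: the helper ols and all data values are hypothetical (the longevity figures are generated from an invented formula plus fixed noise, not drawn from the Factbook).

```python
import numpy as np


def ols(y, *regressors):
    """Ordinary least squares with an intercept.

    Returns (coefficients, standard errors, t-statistics), intercept first.
    """
    y = np.asarray(y, dtype=float)
    cols = [np.ones_like(y)] + [np.asarray(r, dtype=float) for r in regressors]
    X = np.column_stack(cols)
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)          # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)     # coefficient covariance matrix
    se = np.sqrt(np.diag(cov))
    return beta, se, beta / se


# Hypothetical data for seven made-up countries.
ln_gdp = np.array([6.5, 7.2, 8.0, 8.8, 9.5, 10.2, 10.8])
literacy = np.array([45.0, 60.0, 72.0, 85.0, 90.0, 97.0, 99.0])
noise = np.array([0.5, -0.4, 0.3, -0.6, 0.2, 0.4, -0.4])
longevity = 30.0 + 3.0 * ln_gdp + 0.15 * literacy + noise  # invented relationship

beta, se, t_stat = ols(longevity, ln_gdp, literacy)
for name, b_i, s_i, t_i in zip(["intercept", "ln(GDP)", "literacy"], beta, se, t_stat):
    print(f"{name:10s} coef = {b_i:8.3f}  se = {s_i:7.3f}  t = {t_i:7.2f}")
```

    A coefficient is conventionally judged statistically significant when its t-statistic is large in absolute value relative to the critical value for n - k degrees of freedom, which is the same comparison Excel summarizes with its P-value column.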