Regression Exercises
- Every year the College Board publishes average SAT scores for
college-bound seniors by state, and every year a chorus of morons
point out that states which spend a lot per pupil on K-12 public
education generally have lower average SAT scores that states that spend
relatively little per pupil. For example, New York spends
more than twice
as much per pupil as Mississippi, and their kids average more than 200
points lower on the SAT (sum of reading, math and writing), which
just goes to show you how our wasteful, bloated public education system
has been ruined by all those incompetent overpaid bureaucrats, lazy
teachers protected by corrupt unions,
negligent parents, dope-addled sex-crazed kids
and their rap music, somebody oughta do something, blah blah harrumph
harrumph!
Check out some recent data
for yourself. (Note the College Board's cautionary note in the second line of the
spreadsheet.) Use Excel's Regression utility to test the following
model:
SUM08 = B0 + B1EXPPUPIL06
Are the critics onto something? Is the negative B1 coefficient
statistically significant?
How much of the total variation in SAT scores does
this model explain?
Use Excel's Correlation utility to evaluate the pairwise correlations
between states' mean SAT scores, per-pupil spending and SAT test participation rates.
Explain why test participation is positively correlated with per-pupil spending.
Explain why a higher rate of test participation would reduce a state's
average SAT score.
This is a textbook example of an omitted variable problem.
Maybe per-pupil spending has something to do with who takes
the SAT and who doesn't.
Estimate and
interpret an alternative model that accounts for variable
test participation rates:
SUM08 = B0 + B1EXPPUPIL06 +
B2PARTICIP08
Explain the logic behind these model results.
How much of the total variation in SAT scores
does this model explain?
-
Download crime_by_state.xls.
The variable definitions for this spreadsheet are as follows:
VIOCRIM06 Violent Crimes per 100K Persons, 2006
MURDER06 Murders per 100K Persons, 2006
POP1864_06 Adult Population (ages 18-64) in 2006
COPS06 Public Law Enforcement Officers, 2006
INPRISON06 Incarcerated Population, 2006
DEATHPNLTY =1 if State has Death Penalty; =0 Otherwise
DEATHROW06 Prisoners on Death Row, 2006
EXECUTED Number of Executions between 1976 and 2006
PCTINCARC06 Percent of Adult Population Incarcerated, 2006
PCTHSGRAD Percent of Adult Population with High School Diploma or equivalent
UNEMPL06 Statewide Unemployment Rate, 2006
MEDHHINC Median Household Income, 2006
<200%POV Percent of Population Below 200% of Poverty Level, 2006
<100%POV Percent of Population Below 100% of Poverty Level, 2006
<50%POV Percent of Population Below 50% of Poverty Level, 2006
GINI06 Gini Coefficient of Household Income Inequality, 2006
(higher means more unequal)
Use these data to create any additional variables that you think
may affect murder rates and overall violent crime rates.
You are free to look for other variables on the web and add them into your
analysis if you wish.
Estimate a series of multivariate regression models to determine the set
of explanatory variables that most significantly affect murder rates.
Does jailing more criminals reduce violent crime rates?
Is the death penalty an effective deterrent to murder?
Does the percentage of prisoners on death row affect murder rates?
Does the rate at which states execute their death row inmates affect
murder rates?
Will hiring additional police reduce violent crime rates or murder rates?
What policies would you recommend to reduce murder and overall violent crime rates?
|