FREC 480 -- GIS in Natural Resource Management
Census and TIGER Data
This project provides an
introduction to the analysis of US Census Bureau data
mapped with TIGER geodata. The US Constitution requires
the Federal government to conduct a
complete census of state populations every decade for purposes of
reapportioning the House of Representatives. Nowadays the
Census Bureau contacts every household by mail, with follow-up visits by
enumerators. Every household answers the questions on the
"short-form" questionnaire
about occupants' genders, ages, races, etc. A large proportion of
households receive a "long-form" questionnaire containing all the
short-form questions plus additional questions regarding income,
schooling, employment, marital status, etc.
Summary data compiled
from the short-form ("100% count") questions are
called Summary File 1, or "SF1" data.
Summary
data compiled from the long-form ("Sample") questions are called
Summary File 3, or "SF3" data.
The Census Bureau summarizes these data at various levels of geographic
detail,
using a hierarchy of geographic area units: States, Counties, Census
Tracts, Census Block Groups and Census Blocks.
The Bureau also publishes GIS data, known as TIGER
(Topologically Integrated Geographic Encoding and Referencing) files,
that are used to create polygon shapefiles of States, Counties,
Tracts, Block Groups and Blocks.
Each of these is identified by a unique FIPS (Federal Information
Processing Standard) code:
- Each state has a 2-digit FIPS ID; Delaware's is 10.
- Each county within a state has a 3-digit FIPS ID, appended
to the 2-digit
state ID. New Castle County, Delaware, has FIPS ID 10003.
- Each Census Tract within a county has a 6-digit ID,
appended to the county code.
The Tract in New Castle County DE that contains the center of
the UD campus has FIPS ID 10003014502.
- Each Block Group within a Tract has a single digit ID
appended to the Tract ID. The center of campus is Block Group
100030145022.
- Each Block within a Block Group is identified by three more
digits appended to the Block Group ID. Morris Library is located
in Block 100030145022003.
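
The nesting of these codes can be sketched in a few lines of Python. This is only an illustration of the digit layout described in the list above; the names `BlockFips`, `parse_block_fips` and `parent_ids` are hypothetical helpers, not part of any Census or ArcGIS tool.

```python
from typing import NamedTuple


class BlockFips(NamedTuple):
    """Components of a 15-digit Census Block code (hypothetical helper type)."""
    state: str        # 2 digits
    county: str       # 3 digits
    tract: str        # 6 digits
    block_group: str  # 1 digit, which is also the first digit of the block ID
    block: str        # 4 digits, including the block-group digit


def parse_block_fips(code: str) -> BlockFips:
    """Split a 15-digit block code into its hierarchical parts."""
    if len(code) != 15 or not code.isdigit():
        raise ValueError(f"expected 15 digits, got {code!r}")
    return BlockFips(
        state=code[0:2],
        county=code[2:5],
        tract=code[5:11],
        block_group=code[11],
        block=code[11:15],
    )


def parent_ids(code: str) -> dict:
    """Return the full ID of every larger area that contains the block."""
    return {
        "state": code[:2],
        "county": code[:5],
        "tract": code[:11],
        "block_group": code[:12],
    }


# Morris Library's block, taken from the list above.
library_block = "100030145022003"
parts = parse_block_fips(library_block)
parents = parent_ids(library_block)
print(parts)
print(parents)
```

Because each larger unit's ID is a prefix of the block ID, truncating a block code is enough to find the tract or block group that contains it.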
The Bureau releases summarized SF3 data down to the Block Group
level, and summarized SF1 data down to the Block level.
State-, county-, tract-, block-group- and block-level Census data tables
extracted
from these data files can be joined to these polygons via their matching
FIPS codes.
The TIGER data also include point, line and polygon features
representing roads, rail lines, streams, water polygons and other
physical features.
ESRI, the publisher of ArcGIS, maintains a website http://arcdata.esri.com/data/tiger2000/tiger_download.cfm
from which you can download TIGER and associated 2000 Census
data. Access this site to download the following shapefiles for
New Castle County, Delaware:
- Census Block Groups 2000: "tgr10003grp00.shp"
- Census Blocks 2000: "tgr10003blk00.shp"
- Census Tracts 2000: "tgr10003trt00.shp"
- County 2000: "tgr10003cty00.shp"
- Line Features -- Hydrography: "tgr10003lkH.shp"
- Line Features -- Rails: "tgr10003lkB.shp"
- Line Features -- Roads: "tgr10003lkA.shp"
- School Districts -- Unified: "tgr10003uni.shp"
- Water Polygons: "tgr10003wat.shp"
- Also include the Census Block Demographics (SF1) --
"tgr10000sf1blk.dbf" -- in your download.
Unzip all of these files into the same project folder on your data
stick. (The download file from the ESRI site contains a separate
zipped directory for each shapefile; but you should extract the contents
of each of these to the same directory.)
Then use ArcCatalog to rename these shapefiles with more meaningful
names ("roads," "streams," etc.).
Next, download the
Census Block Group demographics file
that I created from the SF3 data for New Castle County.
One sheet of the Excel workbook contains the data; the other
defines the variables.
The TIGER shapefiles are in lat-lon decimal degrees,
but they don't have accompanying projection (.prj) files
that specify this, so
Arc won't handle them correctly until
you define the coordinate system for each shapefile.
Use the Arc Toolbox's
Data Management Tools--Projections and Transformations--Define
Projection tool, or edit the shapefile Properties in
Arc Catalog, to define each shapefile's coordinate system as
"Geographic--Spheroid-Based--GRS1980."
Now load the TIGER shapefiles and SF1 Census database file into
a new ArcMap project.
In the data frame's Properties, set the Coordinate System to
"Projected--State Plane--NAD 1983 (HARN)--Delaware."
This doesn't
alter the unprojected shapefiles; it just displays them
in a State Plane projection
like the map on the left, not like the map on the right.

PART ONE: Exploring TIGER data
- Create a categorical road map with different line styles for sets of
Census Feature Classification Code (CFCC) categories in the Roads
shapefile: A1x's are
interstate highways; A2x's are main highways; A3x's are connecting roads;
A4x's and higher are neighborhood roads, except A63's which are highway ramps.
Group the A10's as a category, the A20's as a category, the A30's as a
category, the A40's and everything else except A63's as a category, and
the A63's as a category.
Include the water and rail features in your map with appropriate display
styles.
Once you get really nice symbology set up for the roads shapefile,
you can save the shapefile symbology as a Layer file.
(You can even save symbologies for a whole group of shapefiles in a
group layer file.)
- Join the SF1 demographics Block-level file to the Census Blocks
shapefile attribute table using the common STFID field.
As explained above, each block is identified in the STFID field by its
hierarchical
15-digit FIPS code
SSCCCTTTTTTBBBB where
SS is the state,
CCC is the county,
TTTTTT is the tract and
BBBB is the block ID. (Block Groups within each
Tract are
identified by the first digit of the block ID.)
Likewise, join the Block-Group-level SF3 data for New Castle
County to your block group shapefile
using the 12-digit block group IDs (SSCCCTTTTTTB).
Create "AREA" fields (data type should be "Double") in the
block attribute table and the
block group attribute table. Then right-click on the
field headings and use "Calculate
Geometry" to
calculate the polygon areas in square meters or square kilometers
(1 sq. km = 1,000,000 sq. m). Note that if you calculate
areas from lat-lon units you get bogus
measures based on "square degrees."
Now create cool-to-hot thematic maps of 2000 population density
by Census Block and by Block Group for the county using whatever
classification scheme works best.
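
The join and density steps in this item amount to a key lookup followed by a division. The sketch below shows that logic outside ArcMap, using made-up rows; the field name POP2000 and all of the numbers are assumptions for illustration, not values from the actual SF1 file.

```python
# Hypothetical rows standing in for the block attribute table (STFID plus an
# AREA field in square meters) and the SF1 table (STFID plus a population
# count). The POP2000 field name and every value here are invented.
blocks = [
    {"STFID": "100030145022003", "AREA": 250_000.0},
    {"STFID": "100030145022004", "AREA": 100_000.0},
    {"STFID": "100030145021001", "AREA": 400_000.0},
]
sf1 = [
    {"STFID": "100030145022003", "POP2000": 500},
    {"STFID": "100030145022004", "POP2000": 50},
]


def join_on_key(left, right, key):
    """Left join: keep every left row and add matching right fields if any."""
    lookup = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = lookup.get(row[key], {})
        joined.append({**row, **match})
    return joined


def density_per_sq_km(row, pop_field="POP2000", area_field="AREA"):
    """People per square kilometer; None when the join found no population."""
    if pop_field not in row or row[area_field] <= 0:
        return None
    return row[pop_field] / (row[area_field] / 1_000_000.0)


joined = join_on_key(blocks, sf1, "STFID")
densities = {row["STFID"]: density_per_sq_km(row) for row in joined}
print(densities)
```

The IDs are kept as text here. If a join field is stored as a number in one table and as text in the other, the keys will not match, and state codes with a leading zero would lose that digit.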
- Download the EPA's point shapefile of toxic
waste sites. The "TYPE" field near the end of the attribute table
identifies the "Superfund" toxic waste sites. I included a field of ones
to use in creating density maps of these.
Use the Spatial Analyst
"Density" or "Interpolate to Raster--Kriging" tool to create separate
density maps of the Superfund sites and all other EPA sites using a search
radius of 5000 meters. Use the Raster Calculator to create a weighted-sum
exposure risk map, adding 5 times the Superfund density plus the other EPA
site density. Does there appear to be a spatial correlation between
exposure risk and poverty rates?

PART TWO: Spatial Statistics
- Open a blank Arc session and download a zipped shapefile of New Castle County block
groups with a different attribute table. Add this shapefile to the
dataframe. This shapefile is in DE State Plane NAD 1983 (HARN)
coordinates.
Under Tools--Extensions, activate the Geostatistical
Analyst extension. Add the Geostatistical Analyst toolbar. Use the
Geostatistical Analyst's "Explore Data" tools to examine the spatial
clustering of poverty and racial groups in the county.
(If you get an error telling you to increase the maximum number of
features a geostatistics tool can analyze,
increase the maximum from 300 to 400: from "My Computer"
open C:--Program Files--ArcGIS--Utilities and run the AdvancedArcMapSettings
utility; make the change in the "Geostatistics" tab.)
- Compare histograms of PCTBLACK, PCTWHITE and ln_B_W_
(the natural log of the ratio PCTBLACK/PCTWHITE).
What do the skewness (asymmetry; a normal distribution has zero skewness)
and kurtosis (fatness of the tails
of the distribution; a normal distribution has kurtosis=3) suggest about
the clustering of blacks and whites?
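
For reference, skewness and kurtosis can be computed from the sample moments as sketched below. This assumes the plain moment-based definitions, under which a normal distribution has skewness 0 and kurtosis 3; the function name and the sample values are invented for illustration.

```python
def skewness_and_kurtosis(values):
    """Moment-based skewness and (non-excess) kurtosis of a sample."""
    n = len(values)
    mean = sum(values) / n
    # Second, third and fourth central moments.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return skew, kurt


# Made-up percentages: one symmetric sample and one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 2, 2, 3, 20]

sym_skew, sym_kurt = skewness_and_kurtosis(symmetric)
tail_skew, tail_kurt = skewness_and_kurtosis(right_tailed)
print(sym_skew, sym_kurt)
print(tail_skew, tail_kurt)
```

A variable concentrated in a few block groups, with most values near zero and a handful of large ones, shows up as positive skewness like the second sample.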
- Compare Normal QQ plots of PCTBLACK, PCTWHITE, ln_B_W_,
MEDHHINC and PCTPOV. How do these accord with your analyses of the
histograms?
- Create a semi-variogram/covariance cloud of ln_B_W_.
These graphs plot the semivariance or covariance of each pair of block groups
against the distance between the pair.
Click the Covariance tab to see the spatial covariance cloud, which
should look like a Nike swoosh.
What does
this suggest about spatial clustering of blacks and whites in the
county? If you suspected there was a particular directionality
to the covariance, you could click the "Show search direction" box
and examine covariance clouds in different search directions.
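
What each dot in such a cloud represents can be sketched as follows. This assumes the usual textbook definitions (half the squared difference for the semivariogram cloud, the product of deviations from the mean for the covariance cloud); the Geostatistical Analyst's own computation may differ in detail, and the coordinates and values below are invented.

```python
import math
from itertools import combinations


def pair_clouds(points):
    """points: list of (x, y, z) tuples.

    Returns one (distance, semivariance, covariance) tuple per pair of points,
    sorted by distance.
    """
    mean_z = sum(z for _, _, z in points) / len(points)
    cloud = []
    for (x1, y1, z1), (x2, y2, z2) in combinations(points, 2):
        dist = math.hypot(x2 - x1, y2 - y1)
        semivar = 0.5 * (z1 - z2) ** 2
        covar = (z1 - mean_z) * (z2 - mean_z)
        cloud.append((dist, semivar, covar))
    return sorted(cloud)


# Hypothetical block-group centroids (meters) with a made-up attribute value.
centroids = [
    (0.0, 0.0, 1.0),
    (3.0, 4.0, 3.0),
    (100.0, 0.0, -2.0),
    (103.0, 4.0, -1.5),
]
cloud = pair_clouds(centroids)
for dist, semivar, covar in cloud:
    print(round(dist, 1), round(semivar, 3), round(covar, 3))
```

When nearby pairs have high covariance (or low semivariance) and distant pairs do not, similar values are clustered in space.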
- In the Geostatistical Wizard toolset, create an Inverse
Distance Weighted interpolation of ln_B_W_.
Then create an ordinary Kriging of ln_B_W_. How does the IDW
interpolation compare to the kriging prediction map?
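
As a point of comparison, the core of inverse distance weighting fits in a few lines. This is a minimal sketch that uses every sample point and a fixed power of 2; the Wizard's neighborhood search and power optimization are not reproduced, and the sample values are invented.

```python
import math


def idw(samples, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    samples: list of (x, y, z) tuples. Every sample is used, each weighted
    by 1 / distance**power.
    """
    num = 0.0
    den = 0.0
    for sx, sy, sz in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return sz  # on a sample point the estimate equals the sample
        w = 1.0 / d ** power
        num += w * sz
        den += w
    return num / den


# Two made-up sample points 10 units apart.
samples = [(0.0, 0.0, 1.0), (10.0, 0.0, 3.0)]
print(idw(samples, 5.0, 0.0))   # midway between the two samples
print(idw(samples, 2.5, 0.0))   # closer to the first sample
```

IDW weights depend only on distance. Kriging instead derives its weights from a semivariogram model fitted to the data, which is one reason the two maps can differ.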
- Now execute a Cokriging of ln_B_W_ (Dataset 1) with
PCTPOV (Dataset 2). How does the co-kriging prediction map
for ln_B_W_ compare with the ordinary kriging prediction map?
(Right-click on the co-kriging layer in the table of contents
and click "Compare...")
 
- In Arc Toolbox's Spatial Statistics--Analyzing Patterns tools,
use the High/Low Clustering (Getis-Ord) tool to determine the probability
that the spatial distributions of high and/or low values of PCTWHITE,
PCTBLACK and PCTPOV are merely random.
- Use the Spatial Autocorrelation (Moran's I) tool to compare the
spatial autocorrelation of PCTBLACK, PCTWHITE and PCTPOV.
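
The statistic behind this tool can be sketched directly from its standard formula, I = (n / S0) × Σ w_ij (x_i − mean)(x_j − mean) / Σ (x_i − mean)². The example below assumes simple binary neighbor weights for areas arranged in a row; the tool's own weighting options and its significance test are not reproduced, and the helper names are hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I.

    values: list of numbers, one per area.
    weights: dict mapping (i, j) index pairs (i != j) to the weight w_ij.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(weights.values())
    cross = sum(w * dev[i] * dev[j] for (i, j), w in weights.items())
    ss = sum(d * d for d in dev)
    return (n / s0) * (cross / ss)


def chain_weights(n):
    """Binary contiguity for n areas in a row: each touches its neighbors."""
    w = {}
    for i in range(n - 1):
        w[(i, i + 1)] = 1.0
        w[(i + 1, i)] = 1.0
    return w


w4 = chain_weights(4)
trend = morans_i([1, 2, 3, 4], w4)          # similar values sit side by side
alternating = morans_i([1, -1, 1, -1], w4)  # neighbors always differ
print(trend, alternating)
```

Positive values indicate that neighbors tend to be alike, negative values that neighbors tend to differ, and values near zero are consistent with a random arrangement.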
- Under the Mapping Clusters tools, run the Cluster and Outlier
and the Hot Spot Analyses on ln_B_W_.
- Under the Modeling Spatial Relationships tools, run an Ordinary
Least Squares regression of ln_B_W_ (dependent variable) against
PCTPOV, i.e. ln_B_W_ = C0 + C1×PCTPOV + e. This procedure
models
the relationship between race and
poverty with intercept and slope coefficient
estimates that do not vary over space.
Does the map of residuals e from this regression
exhibit significant clustering?
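
For a single explanatory variable, the least squares estimates of C0 and C1 have a closed form, sketched below with invented data. The function names are hypothetical and none of the ArcGIS tool's diagnostics are reproduced.

```python
def ols_fit(x, y):
    """Return (c0, c1) minimizing the sum of squared residuals of y = c0 + c1*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    c1 = sxy / sxx
    c0 = mean_y - c1 * mean_x
    return c0, c1


def residuals(x, y, c0, c1):
    """Observed minus predicted value for each observation."""
    return [yi - (c0 + c1 * xi) for xi, yi in zip(x, y)]


# Made-up values standing in for PCTPOV (x) and ln_B_W_ (y).
x = [2.0, 5.0, 8.0, 12.0, 20.0]
y = [-3.1, -2.2, -1.8, -0.4, 0.9]
c0, c1 = ols_fit(x, y)
e = residuals(x, y, c0, c1)
print(c0, c1)
print(e)
```

The same pair (c0, c1) is applied to every observation regardless of location, so any spatial pattern the model misses ends up in the residuals.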
- Now run a Geographically Weighted Regression of ln_B_W_ against
PCTPOV. This procedure allows the regression coefficients
to vary over space. The map of
residuals from this regression should be much more random.
Switch the symbology to display the C1 slope coefficient.
Notice how the relationship between race and poverty is negative
around Newark but strongly positive north of Wilmington. Explain.
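
The idea of coefficients that vary over space can be sketched as a separate weighted regression at each location, with nearby observations weighted more heavily. The example assumes a Gaussian kernel with a fixed bandwidth chosen by hand; the ArcGIS tool's own kernel and bandwidth selection are not reproduced, and the function name and data are invented.

```python
import math


def gwr_local_fit(points, x0, y0, bandwidth):
    """Fit response = c0 + c1 * predictor around the location (x0, y0).

    points: list of (x, y, predictor, response) tuples.
    Each observation is weighted by a Gaussian kernel of its distance
    from (x0, y0). Returns the local (c0, c1).
    """
    w = [math.exp(-0.5 * (math.hypot(px - x0, py - y0) / bandwidth) ** 2)
         for px, py, _, _ in points]
    sw = sum(w)
    # Weighted means of predictor and response.
    mean_p = sum(wi * p[2] for wi, p in zip(w, points)) / sw
    mean_r = sum(wi * p[3] for wi, p in zip(w, points)) / sw
    sxx = sum(wi * (p[2] - mean_p) ** 2 for wi, p in zip(w, points))
    sxy = sum(wi * (p[2] - mean_p) * (p[3] - mean_r)
              for wi, p in zip(w, points))
    c1 = sxy / sxx
    c0 = mean_r - c1 * mean_p
    return c0, c1


# Two made-up clusters 100 units apart with opposite relationships.
west = [(0.0, 0.0, 1.0, 3.0), (1.0, 0.0, 2.0, 2.0), (2.0, 0.0, 3.0, 1.0)]
east = [(100.0, 0.0, 1.0, 1.0), (101.0, 0.0, 2.0, 3.0), (102.0, 0.0, 3.0, 5.0)]
observations = west + east

west_c0, west_c1 = gwr_local_fit(observations, 1.0, 0.0, bandwidth=5.0)
east_c0, east_c1 = gwr_local_fit(observations, 101.0, 0.0, bandwidth=5.0)
print(west_c1, east_c1)
```

A single OLS fit to all six observations would blend the two clusters into one slope, while the local fits recover a negative slope in the west and a positive slope in the east.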
"Do the chickens have large talons?"