FREC 682 -- Spatial Analysis
TIGER data import
TIGER (Topologically Integrated Geographic Encoding & Referencing)
files are the most widely used public GIS data. They support creation of
county-level (1:100,000-scale) base maps of road, rail and water features
(mostly lines) and Census region boundaries (e.g. county, Census tract,
Census block group and Census block) by county. In this project you will
extract records from TIGER data, create GRASS vector maps from them,
manipulate vector map topology, and extract and map Census data.
You may want to check out the GRASS
tutorial by J. Hinthorne, which explains TIGER data, details methods
of importing these data to GRASS, and explains how to create choropleth
maps of Census data.
Each county's TIGER dataset contains a set of files; the two most important
are the .RT1 file, which contains end-nodes and identifiers for ALL the
linework for the county (roads, rail lines, streams, invisible boundaries,
etc.), and the .RT2 file, which contains shape coordinates for the linework
in the .RT1 file. It doesn't make sense to extract all of this linework at
once into an undifferentiated line file. Rather, you should select subsets
of records from the .RT1 file to import as separate maps or layers.
To extract records representing a particular category of line feature,
refer to the Census Feature Classification Code (CFCC), which occupies
columns 56-58 of the .RT1 file. Column 56 is a letter: A=roads;
B=rail features; H=water features. The digits that follow narrow the class:
A1*=interstates; A2*=highways; A3*=connecting roads; A4*=neighborhood roads;
H0*=shorelines; H1*=streams; H2*=canals, ditches; H3*=lakes, ponds;
H4*=reservoirs; H5*=bay, ocean; H7*=invisible water boundaries (pretty
useless).
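The CFCC selection described above can be sketched as a small shell function. This is only an illustration, assuming a POSIX awk and the column positions given in the text; the name cfcc_filter is made up here and is not a GRASS command.

```shell
# Print the TIGER Type 1 records whose CFCC (columns 56-58) matches a
# regular expression. cfcc_filter is a hypothetical helper name.
cfcc_filter() {
    # $1 = regular expression tested against the 3-character CFCC
    # $2 = Type 1 file to read
    awk -v re="$1" 'substr($0, 56, 3) ~ re' "$2"
}
```

For example, cfcc_filter '^H1' nc_tiger.1 would print only the stream records, and cfcc_filter '^B' nc_tiger.1 only the rail records.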
To extract boundary segments for polygons such as census tracts, select
the records where the polygon ID on the "left" side of the segment (based
on the direction in which the segment was digitized) is different
from the polygon ID on the "right" side of the segment. There are
pairs of left- and right-side fields for 3-digit county FIPS codes (cols
135-137 and 138-140), 6-character Census Tract codes (cols 171-176 and
177-182; we'll only use the first four digits of these, which identify
the "basic" tract numbers) and 4-character Census Block codes (cols 183-186
and 187-190; we'll only use the first digit of these, which identifies the
Block Group).
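The left/right comparison can be sketched the same way. The helper below is hypothetical (sides_differ is not a GRASS or TIGER tool) and assumes 1-based column positions as quoted in the text.

```shell
# Keep only the records where a left-side field differs from the matching
# right-side field. sides_differ is a hypothetical helper name.
sides_differ() {
    # $1 = left start column, $2 = right start column,
    # $3 = number of characters to compare, $4 = Type 1 file
    awk -v l="$1" -v r="$2" -v w="$3" \
        'substr($0, l, w) != substr($0, r, w)' "$4"
}
```

With this, sides_differ 171 177 4 nc_tiger.1 would keep the segments whose basic tract numbers differ, and sides_differ 135 138 3 nc_tiger.1 those lying on a county boundary.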
-
Open a GRASS session in the de_utm83 location. The 1997 updates
of the New Castle County (DE) TIGER Type 1 and Type 2 data files tgr10003.rt1
and tgr10003.rt2 are located in /home/grass.data/census.data/.
If you want, create links to these in your home directory, e.g.,
ln -s /home/grass.data/census.data/tgr10003.rt1 nc_tiger.1
Use AWK to extract the appropriate subsets of records from the .RT1
file, and v.in.tig.basic to create the following vector maps:
-
interstates (A1*) combined with cloverleafs (A63)
-
primary and secondary roads (A2* and A3*)
-
(optional--this will be a big file!) neighborhood roads (A4*) combined
with unclassified roads (A0*)
-
all rail features (B*)
-
streams (H1*) combined with unclassified water features (mostly streams)
(H00)
-
shorelines (H01 and H02)
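One possible way to produce all six subsets in a single pass over the Type 1 file is sketched below. The function name and the output file suffixes are invented for this example; only the CFCC groupings come from the list above.

```shell
# Split a Type 1 file into one subset per map in a single pass.
# split_by_cfcc and the output suffixes are made up for this sketch.
split_by_cfcc() {
    # $1 = Type 1 file, $2 = prefix for the output files
    awk -v p="$2" '
    { c = substr($0, 56, 3) }                  # CFCC of this record
    c ~ /^A1/ || c == "A63"   { print > (p ".interstates") }
    c ~ /^A[23]/              { print > (p ".highways") }
    c ~ /^A[04]/              { print > (p ".local") }
    c ~ /^B/                  { print > (p ".rail") }
    c ~ /^H1/ || c == "H00"   { print > (p ".streams") }
    c == "H01" || c == "H02"  { print > (p ".shorelines") }
    ' "$1"
}
```

Each output file is a smaller Type 1 file that can then be imported as its own map. Records with other codes (lakes, for instance) are simply dropped.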
-
Use AWK to extract the appropriate boundary records from the .rt1
file and v.in.tig.basic to create the following vector maps:
-
New Castle County boundary
(select records whose COUNTYL and COUNTYR values, the left- and right-side
3-digit county FIPS codes, are different. These codes start in columns 135
and 138 of the TIGER Type 1 file)
-
Census tracts
(select records whose TRACTL and TRACTR values, the left- and right-side
Census Tract basic codes, are different. These are 4 digits each, starting
in columns 171 and 177 of the TIGER Type 1 file)
-
Census block groups
(select records where TRACTL and TRACTR are different, or where the 1st
digits of the block codes, i.e. the Block Group digits in columns 183 and
187 of the TIGER Type 1 file, are different)
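The block group rule combines two tests, which can be sketched as a single awk condition. The function name blkgrp_boundaries is made up for illustration; the columns are the ones quoted in the list above.

```shell
# Keep segments that separate two block groups: either the basic tract
# numbers differ, or the tracts match but the block group digits differ.
# blkgrp_boundaries is a hypothetical helper name.
blkgrp_boundaries() {
    # $1 = Type 1 file
    awk 'substr($0, 171, 4) != substr($0, 177, 4) ||
         substr($0, 183, 1) != substr($0, 187, 1)' "$1"
}
```

The tract test is needed as well as the block group test because block group 1 of one tract can adjoin block group 1 of a neighboring tract.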


-
Use v.to.rast to create a filled (area) raster map of the county.
This step illustrates how you can manually control GRASS's topology-building.
As generated by v.in.tig.basic, the county boundary is lots
of little arcs, each with its own ID in the map's dig_att file. If you
simply v.to.rast the vector boundary line segments directly, you'll
just get a raster outline, which is not what you want! To get a filled
polygon, you can hack the boundary map's dig_att file, which contains the
arc IDs for all the boundary line segments. The first couple of lines of
the original dig_att file look like this:
L 435289.8280658 4350016.983914 187249930
L 434980.9961061 4349992.139591 187249951
The first field indicates the feature type (L=line; A=area), the next
two fields are easting and northing coordinates (yes, specified to the
micrometer!), and the final field is the arc ID. When you run v.support to
build the topology, GRASS scans the arc records in the map's dig file and
matches each dig_att record by position to the line arc coincident with
or nearest that point, or the multiple area arc segments that circumscribe
that point. Note that GRASS can commingle lines and areas in the
same map.
You can rename the original dig_att file to something else, then extract
its first line with the UNIX head -1 command, e.g.,
head -1 nc.bndy.orig > nc.bndy to create a single-line dig_att
file for the map. Then use a text editor to edit this new dig_att
file: change the feature type from line ("L") to area ("A"), replace the
coordinates with those of any point inside the county, and pick any ID
value you like. The result should have this form:
A 434980.9961061 4349992.139591 1
Then run v.support to rebuild the county boundary vector map's
topology. Then when you v.to.rast this map, you should get
a filled raster polygon. Now you know how to hack the topology of
a vector map.
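If you would rather script the dig_att edit than use a text editor, something like the following might do. It is only a sketch: make_area_att is a made-up name, and it assumes the whitespace-separated "type easting northing ID" layout shown above. You must supply the interior point's coordinates yourself.

```shell
# Build a one-record dig_att file that labels the county as an area.
# make_area_att is a hypothetical helper name.
make_area_att() {
    # $1 = saved copy of the original dig_att file
    # $2 = easting and $3 = northing of a point inside the county
    # $4 = ID value to assign to the area
    head -1 "$1" |
        awk -v e="$2" -v n="$3" -v id="$4" \
            '{ $1 = "A"; $2 = e; $3 = n; $4 = id; print }'
}
```

It would be called as make_area_att nc.bndy.orig EASTING NORTHING 1 > nc.bndy, with EASTING and NORTHING replaced by real coordinates, followed by v.support as described above.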
Use d.vect to superimpose interstates, highways and secondary
roads, rail features and all water features, all in different colors. Save
this display as a .GIF file for a Web page presentation of your work.
-
Create copies of your vector Census tract and block group maps.
-
I have already used the GRASS module m.in.stf1.tape and AWK to extract
separate sets of tract and block group records from the STF1A 1990 Census
files for all 3 counties in Delaware. These files, de.stf1a.blkgrp
and de.stf1a.tract, are located in /home/grass.data/census.data.
Use v.apply.census (what the Hinthorne tutorial refers
to as s.in.stf1) to create area vector maps which can then be rasterized
to create thematic (choropleth) maps.
v.apply.census basically rewrites
the dig_att file of your vector map the same way you hacked the
county dig_att file, replacing line ID records with area ID records where
the IDs are values extracted or calculated from STF1A data fields for
each Census reporting area. The interior X-Y coordinates in the new dig_att
file are extracted from the INTPTLAT and INTPTLON fields (the lat-lon coordinates
of an interior point in the tract or block group) in the STF1A data file,
converted to UTM. Note that v.apply.census overwrites the
map's dig_att file, which is why you should run it on copies
of your vector block group and tract maps.
Refer to the Matrix Section of the STF1A Data Dictionary to identify
the appropriate fields and field lengths in the Census data file.
v.apply.census even lets you do mathematical combinations of fields, e.g.:
v.apply.census in=de.stf1a.tract out=att f='ncc.popden=(I291/J172)*1000*2.59'
creates a map with population densities (people per square mile) as area
IDs in the dig_att file.
(See Note 12 in the STF1A Data Dictionary: area measures are in thousandths
of a square kilometer, so multiplying by 1000 gives people per square
kilometer; there are 2.59 square kilometers per square mile, so multiplying
by 2.59 gives people per square mile.) Note v.apply.census's
simple field reference system for the Census records: for example, "J172"
refers to 10 columns starting in column 172 (A=1 column, B=2, etc.).
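To check which characters a field reference such as J172 or I291 actually picks up, you could decode it by hand with a helper like the one below. This is a sketch only: stf_field is a made-up name, not part of v.apply.census, and it just applies the letter-equals-width rule described above.

```shell
# Extract one field from an STF1A record using a letter+column reference,
# where the letter gives the width (A = 1 column, B = 2, ... J = 10).
# stf_field is a hypothetical helper name.
stf_field() {
    # $1 = reference such as J172, $2 = one STF1A record
    printf '%s\n' "$2" | awk -v ref="$1" '{
        width = index("ABCDEFGHIJKLMNOPQRSTUVWXYZ", substr(ref, 1, 1))
        start = substr(ref, 2) + 0        # column number after the letter
        print substr($0, start, width)
    }'
}
```

For instance, stf_field J172 "$(head -1 de.stf1a.tract)" would print the 10 characters starting in column 172 of the first tract record.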
Create the following vector area maps:
-
population density (see v.apply.census example above) in New Castle
County (people/square mile) by Census block group using the de.stf1a.blkgrp
file.
-
median values of owner-occupied housing units (f='ncc.houseval=I7923')
by Census tract using the de.stf1a.tract file.
-
Now use v.to.rast to create raster thematic maps from these vector
area maps.
-
Finally, use r.neighbors or r.mfilter to do a low-pass (neighbor
averaging) filtering to smooth the edges of your population density and
housing values area features. (Densities and housing values don't really
change abruptly at Census tract or block group boundaries, do they?)
Display and save .GIF versions of these two maps with some vector road,
rail and/or water features superimposed for visual reference.
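To see what neighbor averaging does to an abrupt edge, here is a toy 3x3 mean filter on a small text grid. It is an illustration of the idea only, not how r.neighbors or r.mfilter are implemented; smooth3x3 is a made-up name, and cells on the edge of the grid are averaged over whichever neighbors exist, which may differ from how the GRASS modules treat map edges.

```shell
# 3x3 neighbor averaging on a whitespace-separated grid read from stdin.
# smooth3x3 is a hypothetical helper for illustration only.
smooth3x3() {
    awk '
    { ncol = NF; nrow = NR; for (c = 1; c <= NF; c++) g[NR, c] = $c }
    END {
        for (r = 1; r <= nrow; r++) {
            line = ""
            for (c = 1; c <= ncol; c++) {
                sum = 0; n = 0
                # visit the cell and its (up to) eight neighbors
                for (dr = -1; dr <= 1; dr++)
                    for (dc = -1; dc <= 1; dc++)
                        if ((r + dr, c + dc) in g) {
                            sum += g[r + dr, c + dc]; n++
                        }
                line = line (c > 1 ? " " : "") sprintf("%.1f", sum / n)
            }
            print line
        }
    }'
}
```

Feeding it three rows of "0 0 12 12" turns the sharp jump from 0 to 12 into a ramp, 0.0 4.0 8.0 12.0, in every row.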


-
Create a brief HTML presentation of your work.
-
Optional (some final hacks you can try): Try creating a GRASS site (point)
file directly from the STF1A file using a script of the form:
#!/bin/bash
# script to extract lat-lon ref'd census data in DECIMAL DEGREES
# from STF1A file into GRASS site map in DD:MM:SS
# NOTE: as read below, INTPTLAT is 9 char's long (sign, 2 degree digits in
# cols 270-271, 6 decimal digits in cols 272-277); INTPTLON is 10 char's long
# (sign, 3 degree digits ending in col 281, 6 decimal digits in cols 282-287)
# mx,sx = latitude minutes,seconds; my,sy = longitude minutes,seconds.
# Adding 100 and then dropping the leading "1" with substr() zero-pads them.
awk '{ mx=100+int(substr($0,272,6)*.00006) ;
       sx=100+(substr($0,272,6)*.0036)%60 ;
       my=100+int(substr($0,282,6)*.00006) ;
       sy=100+(substr($0,282,6)*.0036)%60 }
     {print substr($0,270,2) ":" substr(mx,2,7) ":" substr(sx,2,7) "|" \
            substr($0,280,2) ":" substr(my,2,7) ":" substr(sy,2,7) "|" \
            substr($0,72,3) substr($0,52,6) substr($0,51,1)}' \
    < de.stf1a.blkgrp > blkgrp.points
You could display these in a lat-long location, or convert them to
UTM with m.ll2u. Alternatively, you could hack up a new dig_att
file by hand this way.
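The degrees-to-DD:MM:SS arithmetic in the script above can also be written with printf zero-padding instead of the add-100 trick. The sketch below is a standalone illustration, assuming a signed coordinate string with six implied decimal places; dd_to_dms is a made-up name, seconds are truncated to whole numbers, and the sign is dropped as in the script above.

```shell
# Convert a signed fixed-width coordinate with six implied decimal places
# (e.g. +39500000 for 39.5 degrees) to DD:MM:SS.
# dd_to_dms is a hypothetical helper name.
dd_to_dms() {
    # $1 = coordinate string such as +39739000 or -075500000
    awk -v v="$1" 'BEGIN {
        sub(/^[+-]/, "", v)                      # drop the hemisphere sign
        deg  = int(v / 1000000)                  # whole degrees
        secs = int((v % 1000000) * 36 / 10000)   # fractional degree in seconds
        printf "%d:%02d:%02d\n", deg, int(secs / 60), secs % 60
    }'
}
```

For example, dd_to_dms +39500000 prints 39:30:00 and dd_to_dms -075500000 prints 75:30:00.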