GIS Analyses of Dr. Snow's Map

Snow's map, demonstrating the cholera deaths clustered around the Broad Street well, provided strong evidence in support of his theory that cholera was a water-borne disease. Snow drew Thiessen polygons around the wells, defining straight-line least-distance service areas for each. Each Thiessen polygon is composed of boundary segments that perpendicularly bisect line segments drawn between the point it contains and adjacent points. A large majority of the cholera deaths fell within the Thiessen polygon surrounding the Broad Street pump, and a large portion of the remaining deaths were on the Broad Street side of the polygon surrounding the bad-tasting Carnaby Street well.

Then Snow redrew the service area polygons to reflect shortest paths along streets to wells, and an even larger proportion of the cholera deaths fell within the Broad Street polygon or the Broad Street side of the Carnaby Street well's polygon.

You can try replicating Snow's analyses using GIS allocation and density algorithms. The pump and death data points were digitized by Rusty Dodson at the National Center for Geographic Information & Analysis (NCGIA) at UC Santa Barbara, using an arbitrary (not geo-referenced) scan of Snow's map. These data locate 578 cholera deaths and the 13 public wells in an arbitrary XY coordinate system. I edited these plain text files so they can be imported directly into Arc9: deaths.txt and pumps.txt. I also edited a high-resolution JPEG-format scan of Snow's map, correcting some broken lines and converting it to a black-and-white TIFF-format image. You can see a GIF-format version of this map here. The following steps prepare the data for analysis with Arc9:

This is the rectified image with the pumps (cyan) and deaths (red) overlaid. The digitizing of the pumps and deaths was not highly accurate, and the original map is distorted by at least one crease, so I used a third-order rectification, which accounts for the visible warping of the map. You might prefer to register the pumps and deaths to the image coordinates instead.

wells and cholera deaths

Arc's Spatial Analyst includes a Distance-Allocation algorithm that defines zones of cells (known as Thiessen polygons) closest to each pump.  The allocation map shown here is based on straight-line distances to each pump.  The Thiessen polygon for the Broad Street pump contains 356 (61.6%) of the 578 death records.

Thiessen derivation; Thiessen polygons
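The straight-line allocation can be approximated outside Arc by assigning each death to its nearest pump and counting the results, which is equivalent to testing which Thiessen polygon each point falls in. The following is a minimal sketch, not the Spatial Analyst algorithm itself. The function names and the coordinates are hypothetical and do not come from deaths.txt or pumps.txt.

```python
import math
from collections import Counter


def nearest_pump(point, pumps):
    """Return the label of the pump closest to `point` by straight-line distance."""
    x, y = point
    return min(
        pumps,
        key=lambda name: math.hypot(x - pumps[name][0], y - pumps[name][1]),
    )


def allocate(deaths, pumps):
    """Count deaths per nearest pump, i.e. per Thiessen polygon."""
    return Counter(nearest_pump(d, pumps) for d in deaths)


# Hypothetical coordinates for illustration only (not the digitized data).
pumps = {"A": (0.0, 0.0), "B": (10.0, 0.0)}
deaths = [(1.0, 1.0), (2.0, -1.0), (4.0, 3.0), (9.0, 0.5)]

counts = allocate(deaths, pumps)
share_a = 100.0 * counts["A"] / len(deaths)
print(counts, f"{share_a:.1f}% of deaths fall in pump A's polygon")
```

With the real data, the same percentage calculation applied to the Broad Street pump would be 100 × 356 / 578.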

Here is a kernel density map of the cholera deaths (kernel size = 1.0; cellsize = 0.0025) with density contours overlaid.   The density of cholera deaths derived from this map is 36.8 at the Broad Street pump, versus 2.4 at Carnaby Street, 1.9 at Rupert Street, 0.8 at Marlborough Mews, 0.2 at Bridle Street, 0.1 at Newman Street and zero at all other pumps.  A simple density analysis with no smoothing yielded a similar map with discrete edge segments.

kernel density map
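To make the density values easier to interpret, here is a minimal sketch of a kernel density estimate evaluated at a single location. It assumes a quartic kernel and treats the "kernel size" as a search radius. Both are assumptions of this sketch, and the result may not match Arc's output exactly. The sample points are hypothetical.

```python
import math


def quartic_kernel_density(x, y, points, radius):
    """Density at (x, y): sum of quartic kernels centered on each point.

    Each kernel integrates to 1 over its disc, so the result is in
    points per unit area.
    """
    total = 0.0
    for px, py in points:
        # Squared distance scaled by the search radius.
        u2 = ((x - px) ** 2 + (y - py) ** 2) / radius ** 2
        if u2 < 1.0:  # points outside the radius contribute nothing
            total += (3.0 / math.pi) * (1.0 - u2) ** 2
    return total / radius ** 2


# Hypothetical points: two deaths at one address, one a short distance away.
sample_deaths = [(0.0, 0.0), (0.0, 0.0), (0.5, 0.0)]

at_cluster = quartic_kernel_density(0.0, 0.0, sample_deaths, radius=1.0)
far_away = quartic_kernel_density(5.0, 5.0, sample_deaths, radius=1.0)
print(at_cluster, far_away)
```

Evaluating such a function at each pump location, rather than over a whole grid, gives per-pump figures like the ones quoted above.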

Since a straight-line allocation implies travel through walls and buildings rather than only on streets, I experimented with "cost"-minimizing allocations of deaths to pumps, where the cost of travel across street cells is low and travel across other cells is high.  This basically replicates Snow's travel-distance analysis.

To distinguish street cells from the interior cells of blocks, you can use a paintbucket tool (try MS-Paint, Adobe Photoshop or another image editor) to spill some other color into the streets. If this color "leaks" into any blocks, you will have to close the break in the block boundary. I edited the image to close lines on a number of city blocks so the paintbucket color wouldn't leak into them, and I cleared labels from some streets so they would be "passable" (see edited map). I then used the paintbucket tool to flood all the streets with black. This altered image was then converted to a binary grid (streets and not streets).
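The paintbucket step is a flood fill, and the "leak" problem is easy to see on a tiny grid. The sketch below is illustrative only. The cell codes (0 = blank, 1 = drawn line, 2 = street fill) and the 5 × 5 images are hypothetical stand-ins for the scanned map.

```python
from collections import deque


def flood_fill(grid, start, new_value):
    """Fill the 4-connected region containing `start` in place.

    Returns the number of cells that were recolored.
    """
    rows, cols = len(grid), len(grid[0])
    r0, c0 = start
    old_value = grid[r0][c0]
    if old_value == new_value:
        return 0
    grid[r0][c0] = new_value
    filled = 1
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == old_value:
                grid[nr][nc] = new_value
                filled += 1
                queue.append((nr, nc))
    return filled


# A closed city block: the fill stays in the surrounding street.
closed_block = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
# The same block with a one-cell break in its boundary: the fill leaks in.
broken_block = [row[:] for row in closed_block]
broken_block[1][2] = 0

filled_closed = flood_fill(closed_block, (0, 0), 2)
filled_broken = flood_fill(broken_block, (0, 0), 2)

# Binary grid: 1 where the fill reached (street), 0 elsewhere.
streets = [[1 if v == 2 else 0 for v in row] for row in closed_block]
print(filled_closed, filled_broken)
```

In the broken case the block interior is recolored along with the street, which is why each gap in a block boundary has to be closed before filling.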

The allocation of deaths to pumps across a simple cost surface ("cost" of each street cell=1; each other cell=50) assigned 378 (65.4%) of 578 deaths to the Broad Street pump's cost-weighted allocation polygon, and again, a large proportion of the remaining deaths were on the Broad Street side of the Carnaby Street well's supply zone.

cost-weighted allocation of deaths to wells
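The cost-weighted allocation can be sketched as a multi-source shortest-path search over the binary street grid, using the per-cell costs quoted above (street = 1, other = 50). This is a rough illustration rather than Arc's implementation. The 4-neighbor moves, the averaging of the two cells' costs per step, and the tiny layout are all assumptions of this sketch.

```python
import heapq

STREET_COST, OTHER_COST = 1, 50  # per-cell costs quoted in the text


def cost_allocation(is_street, sources):
    """Label every cell with the source reachable at the least accumulated cost.

    is_street: 2D list of booleans; sources: {label: (row, col)}.
    A step between 4-connected cells costs the mean of the two cells' costs.
    """
    rows, cols = len(is_street), len(is_street[0])
    cost = [[STREET_COST if s else OTHER_COST for s in row] for row in is_street]
    best = [[float("inf")] * cols for _ in range(rows)]
    label = [[None] * cols for _ in range(rows)]
    heap = []
    for name, (r, c) in sources.items():
        best[r][c] = 0.0
        label[r][c] = name
        heapq.heappush(heap, (0.0, r, c, name))
    while heap:
        dist, r, c, name = heapq.heappop(heap)
        if dist > best[r][c]:
            continue  # stale queue entry; a cheaper route was already found
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_dist = dist + (cost[r][c] + cost[nr][nc]) / 2.0
                if new_dist < best[nr][nc]:
                    best[nr][nc] = new_dist
                    label[nr][nc] = name
                    heapq.heappush(heap, (new_dist, nr, nc, name))
    return label


# Hypothetical layout: "S" = street cell, "." = block interior.
layout = [
    "SSSSS",
    "....S",
    "SSSSS",
]
is_street = [[ch == "S" for ch in row] for row in layout]
pump_cells = {"A": (0, 1), "B": (2, 4)}

zones = cost_allocation(is_street, pump_cells)
# Cell (2, 1) is closer to pump A in a straight line (2 cells vs. 3),
# but the block lies in between, so street travel assigns it to pump B.
print(zones[2][1])
```

Counting the deaths whose cells carry each label then gives the cost-weighted counterpart of the straight-line tally.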

A more elaborate cost surface could be built from street widths, derived from straight-line distances from street edges to street centerlines, so that travel on narrow, less-passable streets costs more per cell than on wide, passable streets. This would require substantially more editing of the map image, such as removing all the remaining street labels.
