Digital Image Basics
(adapted from a lecture written by Betsy Mackenzie)

Analog photographs are created by exposing a film of photo-reactive silver halide crystals to light, developing the negative, and then printing it on photo paper.  The photo-reactivity of silver halide crystals increases with size, which is why large prints from high-speed film tend to be grainier than prints from slower film.  The silver halide crystals are arbitrarily distributed in the emulsion spread on the film, so the graininess of prints is arbitrary too.

In contrast, a digital image is a regular array (aka grid or raster) of pixels that vary in brightness and color.  The pixel values are recorded as numbers.  The regularity of the array makes the image readily recoverable from the numbers, and the numbers are easily edited.  If you zoom in on a digital image, you will typically see the pixels as squares. 

Digital images are basically pixel data.  The amount of data an image contains is determined by its size, resolution and bit depth.

  • Size is simply the height and width of the image, measured in inches, centimeters, points or picas.  (This is not to be confused with the image's file size, measured in KB or MB).
  • Resolution is pixels per linear inch.  If your monitor is set to display 800 x 600 pixels and its screen measures 10.5 by 8 inches, the resolution is about 75 pixels per inch.  If you change your video settings on the same monitor to 1024 x 768 pixels, the resolution becomes about 96 pixels per inch.  Higher resolution means squeezing more pixels into the same space.
  • Bit depth is the number of bits used to store the color information for each pixel.  The pixel value might refer to an indexed color, or it might code a mix of red, green and blue values.  It would specify a shade of gray in a grayscale image, or it might simply indicate if the pixel is black or white.
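
The monitor arithmetic in the Resolution item can be checked with a short sketch.  The function name pixels_per_inch is made up for illustration; it simply divides a pixel count by a physical length.

```python
def pixels_per_inch(pixels: int, inches: float) -> float:
    """Linear resolution: how many pixels are squeezed into one inch."""
    return pixels / inches


# The 10.5 x 8 inch monitor from the text, at two video settings.
for width_px, height_px in [(800, 600), (1024, 768)]:
    horizontal = pixels_per_inch(width_px, 10.5)
    vertical = pixels_per_inch(height_px, 8.0)
    print(f"{width_px} x {height_px}: {horizontal:.1f} x {vertical:.1f} ppi")
```

The vertical figures reproduce the 75 and 96 pixels per inch quoted above.  The horizontal figures come out slightly higher because a 10.5 x 8 inch screen is not exactly the same shape as an 800 x 600 pixel grid.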

Brief detour: a binary math primer

A bit (short for "binary digit") is the smallest unit of computer information.  The binary number system has only 2 digits (0,1), which reference the on-or-off charge states of the fundamental data elements in computers. 

In base 10, with one decimal place we can count 10^1 = 10 items (numbered 0 to 9); with two decimal places we can count 10^2 = 100 items (0 to 99); with three decimal places, 10^3 = 1,000 items, and so on.

In base 2 we can count:
2^1 = 2 permutations of 1 bit: 0 1
2^2 = 4 permutations of 2 bits: 00 01 10 11
2^3 = 8 permutations of 3 bits: 000 001 010 011 100 101 110 111
...and so on. So a one-bit image contains only black or white pixels; a 4-bit image can have 16 colors or shades of gray; an 8-bit image can have 256 colors or shades of gray; and a 24-bit image can mix 256 levels of red x 256 levels of green x 256 levels of blue to yield over 16 million possible colors.  This roughly matches the color sensitivity of normal human vision.  (Human vision is actually much more sensitive to brightness than to hue.)  A 30-bit color depth (10 bits per primary) is used for high-end imaging that needs finer color gradation.
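
The counts above all follow from one rule: n bits give 2^n distinct values.  A minimal sketch (the function name color_count is just for illustration):

```python
def color_count(bit_depth: int) -> int:
    """Number of distinct values a pixel of the given bit depth can hold."""
    return 2 ** bit_depth


for bits in (1, 4, 8, 24):
    print(f"{bits:>2}-bit: {color_count(bits):,} possible values")
```

For 24 bits the result is 16,777,216, which is the same as 256 x 256 x 256.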

So image size, resolution and bit depth jointly determine how the image looks and how much storage it requires. A simple formula to determine the size of an uncompressed file is:

size of image file in bits = height x width x resolution^2 x bit depth

(To convert to KB, there are 8 bits in a byte and 2^10 = 1024 bytes in a kilobyte, hence 8,192 bits/KB.)

Bit-depth vs. resolution

Compare the two equal-sized images here. The original of the upper left image has a bit depth of 1 and a resolution of 2000 dots per inch. Each tiny pixel in this image is either white or black. The original of the upper right image has a bit depth of 8 and a resolution of 200 dots per inch. This image has much coarser pixels, but each pixel can hold any one of 256 shades of gray. The two details showing the guy's left eye and glasses frame illustrate the difference quite clearly. The overall image quality is about the same, but the first image file is over 12 times larger than the second. Without any compression, the 1-bit 2,000-dpi image is (2 x 2 x 2000^2 x 1)/8192 = 1953 KB, while the 8-bit 200-dpi image is (2 x 2 x 200^2 x 8)/8192 = 156 KB.
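
The two file sizes can be reproduced with a small sketch of the uncompressed-size formula.  The function name uncompressed_kb is an illustrative choice, and the 2 x 2 inch image size is taken from the arithmetic in the paragraph above.

```python
BITS_PER_KB = 8 * 1024  # 8 bits per byte, 1024 bytes per kilobyte


def uncompressed_kb(height_in, width_in, dpi, bit_depth):
    """Uncompressed image size in KB: height x width x dpi^2 x bit depth."""
    bits = height_in * width_in * dpi ** 2 * bit_depth
    return bits / BITS_PER_KB


one_bit = uncompressed_kb(2, 2, 2000, 1)   # 1-bit image at 2000 dpi
eight_bit = uncompressed_kb(2, 2, 200, 8)  # 8-bit image at 200 dpi

print(f"1-bit, 2000 dpi: {one_bit:.0f} KB")
print(f"8-bit,  200 dpi: {eight_bit:.0f} KB")
print(f"ratio: {one_bit / eight_bit:.1f}")
```

The ratio works out to 12.5, which is the "over 12 times larger" figure.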

When processing images, keep your ultimate size and resolution objectives in mind. For example, if you are scanning a 3 x 5 inch image for eventual printing as a 6 x 10 hardcopy image on a 300dpi printer, there is no advantage in scanning the source image at more than 600dpi. When you resize the image from 3x5 to 6x10 the resolution will change from 600dpi to 300 dpi. A good rule of thumb to use when deciding at which resolution to scan is

scanner resolution = final dpi x (intended image width/original image width)
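
As a quick check, here is the rule of thumb as a sketch in code (the name scan_dpi is made up for illustration), applied to the 3 x 5 inch example:

```python
def scan_dpi(final_dpi, intended_width, original_width):
    """Scanner resolution = final dpi x (intended width / original width)."""
    return final_dpi * (intended_width / original_width)


# A 3-inch-wide original, printed 6 inches wide on a 300 dpi printer.
print(scan_dpi(300, intended_width=6, original_width=3))  # 600.0
```

The same function shows that when you shrink an image instead, you can scan at less than the printer's resolution: scan_dpi(300, 3, 6) gives 150.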

Image size and resolution are inversely related: an increase in size implies a decrease in resolution. A common effect of increasing image size is that the pixel artifacts ("jaggies") become more obvious.

Some color theory

Digital images are raster or grid arrays of color values, and are best suited to images such as photographs that are not efficiently represented by vector instructions or ASCII codes.


Each pixel of your computer monitor is a mix of R, G and B primaries.  The color display capabilities of your computer are defined by the bit-depth of your video card and monitor.  A 24-bit display has 256 levels of intensity for each of the three primary colors, and thus supports a potential palette of 256^3 or 16,777,216 different colors.  Color spaces are mathematical representations of color.  There are three primary color-space models you should get familiar with.

The RGB color space is a three dimensional cube with red, green and blue on each of the axes. Colors are coordinates on the cube where black is 0,0,0 and white is 255,255,255. All other colors are represented by their coordinates within the cube. The RGB color space is called additive because you add various intensities of red, green and blue to black to get a color. RGB is the standard color space used in computer displays, scanners and film recorders. This makes sense when you consider a computer image starts with a black screen (no intensity).
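
To make the cube coordinates concrete, here is a minimal sketch that packs three 8-bit levels into one 24-bit number and back.  The function names and the red-green-blue byte order are assumptions for illustration; real file formats and video cards order the bytes in various ways.

```python
def pack_rgb(r: int, g: int, b: int) -> int:
    """Combine three 0-255 levels into a single 24-bit value."""
    for level in (r, g, b):
        if not 0 <= level <= 255:
            raise ValueError("each level must be in the range 0 to 255")
    return (r << 16) | (g << 8) | b


def unpack_rgb(value: int) -> tuple:
    """Split a 24-bit value back into its (r, g, b) coordinates."""
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF


print(pack_rgb(0, 0, 0))        # black: 0
print(pack_rgb(255, 255, 255))  # white: 16777215, the largest 24-bit value
print(unpack_rgb(pack_rgb(12, 34, 56)))
```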

Another color space, called CMYK, is subtractive: colors are defined by subtracting values of cyan, magenta and yellow (the complements of red, green and blue) from white. The CMYK color space is used in printing. Again, a logical choice since printing usually starts with a white piece of paper. The K in CMYK stands for black.  CMYK printers can substitute cheap black ink for equal proportions of expensive CMY inks when printing color blends.
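
The subtraction and the black-ink substitution can be sketched with the common textbook conversion below.  This is a naive, device-independent formula, not what a real printer driver does (those rely on measured ink and paper profiles), and the function name rgb_to_cmyk is just for illustration.

```python
def rgb_to_cmyk(r, g, b):
    """Naive conversion of 0-255 RGB levels to CMYK fractions in 0.0-1.0."""
    # Subtract from white: each ink is the complement of one primary.
    c = 1 - r / 255
    m = 1 - g / 255
    y = 1 - b / 255
    # The ink amount common to all three can be printed as black instead.
    k = min(c, m, y)
    if k == 1:  # pure black: no colored ink is needed at all
        return 0.0, 0.0, 0.0, 1.0
    # Remove the shared part and rescale what is left of each colored ink.
    return (c - k) / (1 - k), (m - k) / (1 - k), (y - k) / (1 - k), k


print(rgb_to_cmyk(255, 0, 0))      # red: magenta plus yellow, no black
print(rgb_to_cmyk(128, 128, 128))  # mid gray: black ink only
```

Note how the gray pixel, which would otherwise need equal parts of all three colored inks, ends up using black ink alone.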

The HIS color space defines colors based on their hue (dominant color), intensity (brightness) and saturation (color purity).  The HIS color space is often represented by a cone. The axis running through the center of the cone is the "greyline" which changes from black (0) at the tip of the cone to white (255) at the base of the cone.  Intensity or brightness is the distance out the greyline.  Each cross-section of the cone is a color wheel of uniform intensity.   Hue is specified as the angular distance from red (0) to green (86) to blue (170) to red again (255) around the color wheel.  Saturation is specified as the relative distance from the grey center of the color wheel (0) where R, G and B are in equal proportions, to the perimeter of the wheel (255) where the color is a mix of at most two primaries.   (The grey line also corresponds to a diagonal through the RGB cube from the bottom foreground corner to the upper back corner.)

You will mostly be using the RGB model to manipulate your images. The HIS model is really an alternative set of dimensions for color space, and we will use an interesting application of this model when we composite hue and intensity bands to create hillshaded terrain images. HIS has other uses: for example, it would be much easier to select the yellow pixels in an image via hue value rather than as the set of all pixels with equal red and green levels.
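
Here is a rough sketch of the yellow-selection idea.  It uses the HSV conversion in Python's standard colorsys module as a stand-in for the HIS model described above (HSV measures brightness a little differently, and colorsys reports hue as a fraction from 0 to 1 rather than on a 0 to 255 scale).  The function name and the tolerance and threshold values are arbitrary choices for illustration.

```python
import colorsys

YELLOW_HUE = 1 / 6  # yellow sits midway between red (0) and green (1/3)


def is_yellowish(r, g, b, hue_tolerance=0.04, min_saturation=0.3, min_value=0.3):
    """True if a 0-255 RGB pixel has a hue near yellow and is not gray or dark."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    near_yellow = abs(h - YELLOW_HUE) <= hue_tolerance
    return near_yellow and s >= min_saturation and v >= min_value


pixels = [(255, 255, 0), (250, 240, 30), (255, 0, 0), (128, 128, 128), (0, 0, 255)]
yellow = [p for p in pixels if is_yellowish(*p)]
print(yellow)  # [(255, 255, 0), (250, 240, 30)]
```

A single hue test picks up the slightly off-yellow pixel (250, 240, 30) as well, which a strict "red equals green" test in RGB would miss.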

Digital cameras and scanners

A charge-coupled device (CCD) is a silicon semiconductor that acts as a light detector.  When light hits the crystalline silicon in the device, the electrons in the silicon become excited and create an electrical charge proportional to the amount of light (or the number of photons) that the silicon is exposed to.

Digital cameras use a grid (2-dimensional array) of crystalline silicon CCDs.  The entire array is exposed at once, and each cell in the grid captures a value which becomes a pixel or dot in the digital image.  Three filters (red, green and blue) capture the intensity of each band of light, creating a composite color image.  The size of the grid (number of megapixels) determines the maximum resolution of the camera.

The intensity of light hitting the CCD is measured on a scale of 0 to 255.  Three measurements are taken (red, green and blue) for a total of 24-bit color depth.  The picture is then stored on some removable media (e.g., flash card or SD card).

Desktop scanners use a linear array of CCDs that passes under the face-down analog image, recording it as pixel values one line at a time.  The resolution of a scanner is determined by the optical resolution of the CCD array in the horizontal direction and by the speed and accuracy of the motor that moves the array in the vertical direction.  The important number to consider when looking at scanner resolution is the optical resolution; many scanners can double or quadruple their effective resolutions by interpolating pixels between recorded pixel values.  Some scanners use a white light source and three (RGB) filters to capture the image in one pass.  Others make three passes, one for each of three primary colors.
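
To see why interpolated resolution is less valuable than optical resolution, consider this minimal sketch of pixel interpolation along one scan line.  It uses simple averaging of neighbors, which is an assumption for illustration; actual scanner software may use more elaborate methods.

```python
def double_resolution(scan_line):
    """Insert the average of each neighboring pair between recorded pixels."""
    if not scan_line:
        return []
    result = []
    for left, right in zip(scan_line, scan_line[1:]):
        result.append(left)
        result.append((left + right) // 2)  # interpolated, not measured
    result.append(scan_line[-1])
    return result


recorded = [0, 100, 200, 50]
print(double_resolution(recorded))  # [0, 50, 100, 150, 200, 125, 50]
```

The output has nearly twice as many pixels, but every second value is computed from its neighbors rather than measured, so no new detail has been captured.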

Image editing

After acquiring the image you can use a digital image program to resize and crop it, adjust brightness and contrast, and make other edits.  The simplest editor to use is the Paint program that comes with MS-Windows, although its capabilities are limited and its default BMP formats (see below) are not particularly efficient.

You can learn a lot of digital image theory from using a fancier image editor such as Adobe Photoshop.  The GIS lab generally has this installed on its machines.  Photoshop lets you edit multiple layers of an image, and supports a large number of filters and other enhancements.  When using Photoshop you should always edit your images in RGB mode rather than Indexed Color mode.

If you don't have access to Photoshop, try GIMP (GNU Image Manipulation Program), a freely distributed package that replicates most Photoshop functions (http://www.gimp.org/).

The copland server has the old UNIX xview ("xv") program, which you can access via the Hummingbird Exceed X-emulator.  Its color editing screen makes color theory quite understandable. 

Image formats

Although web download speeds have increased dramatically over the past decade, you should still try to keep file sizes for your web images relatively small.  Larger image files take longer to download because more data packets must be sent and reassembled.  Web surfers can be notoriously impatient with slow downloads, and may not wait around for your huge images.

The Web supports three image file formats: GIF, JPEG and PNG.

The GIF (Graphics Interchange Format) format uses "indexed" 8-bit color and image resolution of 72 dpi.  The 256 colors allowed by the 8-bit color depth are chosen from 256^3 = ~16 million possible colors (256 levels of R x 256 levels of G x 256 levels of B), and are indexed in the image's color table.  Then each pixel's byte value simply refers to an index value in the color table.  The GIF format supports animation by timed display of a series of GIFs to create a cartoon-like effect.  GIFs can also be interlaced so that when they are downloaded they display every other line and then go back and fill in the missing lines.  This makes the image seem to appear faster.  GIF files can also include specified transparent colors, so that you can blend a GIF's background into the page.
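
The color-table idea can be sketched with plain Python lists.  The palette and pixel values below are made up for illustration, and this is not the actual GIF file layout; it only shows how small index values stand in for full RGB triples.

```python
# The color table: up to 256 entries, each a full 24-bit RGB color.
palette = [
    (255, 255, 255),  # index 0: white
    (0, 0, 0),        # index 1: black
    (200, 30, 30),    # index 2: a red
    (30, 90, 200),    # index 3: a blue
]

# The image itself stores only one small index per pixel.
indexed_pixels = [
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [3, 3, 2, 1],
]


def to_rgb(pixels, color_table):
    """Look up every pixel's index in the color table."""
    return [[color_table[index] for index in row] for row in pixels]


rgb_image = to_rgb(indexed_pixels, palette)
print(rgb_image[1])  # [(255, 255, 255), (200, 30, 30), (200, 30, 30), (0, 0, 0)]
```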

GIF uses a lossless compression algorithm known as LZW (Lempel-Ziv-Welch, after its inventors); the patent on LZW, held by Unisys, was the basis for the licensing fees once charged for use of the GIF format.  LZW builds a dictionary of repeated pixel sequences rather than doing pure run-length coding, but run-length coding conveys the idea: it basically abbreviates "00000000001110000000000" to "10x0 3x1 10x0."  This compression strategy is most efficient for images that have long (horizontal) runs of uniform color values, but it is not very efficient for most photographs.
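
The run-length abbreviation in the example above can be sketched in a few lines.  This illustrates run-length coding only; it is not an implementation of LZW, and the function names are chosen for illustration.

```python
from itertools import groupby


def run_length_encode(bits: str):
    """Collapse each run of identical symbols into a (count, symbol) pair."""
    return [(len(list(run)), symbol) for symbol, run in groupby(bits)]


def run_length_decode(runs):
    """Expand (count, symbol) pairs back into the original string."""
    return "".join(symbol * count for count, symbol in runs)


bits = "0" * 10 + "1" * 3 + "0" * 10  # the 23-character string from the text
runs = run_length_encode(bits)
print(" ".join(f"{count}x{symbol}" for count, symbol in runs))  # 10x0 3x1 10x0
```

Decoding the pairs gives back exactly the original string, which is what makes the scheme lossless.  Try encoding "0101010101" to see why it does poorly when there are no long runs.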

The PNG format was developed as a web-compatible, freely usable substitute for GIF after Unisys, which held the patent on the LZW compression used in GIF, began charging software developers licensing fees.  That patent has since expired, and GIF remains in wide use.

The JPEG (Joint Photographic Experts Group) format is typically much more efficient for storing photographic images.  The JPEG coding process sections the image into 8 x 8 blocks of pixels, and calculates cosine transforms that approximate the intensity and hue shifts within each block.  The image file just stores the transform coefficients, not the original pixel values, so the JPEG decoding process produces an image that only approximates the original.  So this is a "lossy" format, but its compression efficiency can be very high.  JPEGs can be saved at various levels of image quality, trading compression efficiency against the additional transform information needed to retain image quality.  Low-quality JPEGs with maximum compression often exhibit discernible "smears" on the edges of image features, and the 8 x 8 pixel blocks may be annoyingly obvious.
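
For the curious, here is a bare-bones sketch of the cosine transform applied to a single 8 x 8 block.  It covers only the transform step; a real JPEG encoder also converts color spaces, rounds off (quantizes) the coefficients and packs them compactly, none of which is shown.  The function name dct_8x8 is made up for illustration.

```python
import math

N = 8  # JPEG works on 8 x 8 blocks


def dct_8x8(block):
    """Two-dimensional discrete cosine transform (DCT-II) of an 8 x 8 block."""
    def scale(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = 0.25 * scale(u) * scale(v) * total
    return coeffs


# A perfectly flat gray block: every pixel has the same value.
flat = [[100] * N for _ in range(N)]
coeffs = dct_8x8(flat)
significant = sum(1 for row in coeffs for c in row if abs(c) > 1e-6)
print(round(coeffs[0][0], 6), significant)  # 800.0 1
```

All 64 pixel values of the flat block collapse into a single meaningful coefficient.  Smoothly varying blocks behave similarly, with only a few large coefficients, which is why the format is so efficient for photographs.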

The quality of an image can really suffer from cumulative information loss if you edit and resave it in JPEG format multiple times.  If you think you may have to re-edit an image, you should keep it in a lossless format such as BMP or even GIF. 

Other image formats are not directly supported by the Web, but are worth being aware of:

BMP (Microsoft Windows Bitmap) formats are recognized by all Windows programs and most other PC applications.  The format supports multiple bit depths: 1-bit, 4-bit, 8-bit and 24-bit.  But since BMP images are typically stored without compression, their file sizes are often 10+ times as big as equivalent GIF or JPEG images.

TIFF (Tagged Image File Format) was originally created by the Aldus and Microsoft Corporations to store scanned images. There are actually many types of TIFF format; most platforms recognize the standard types.  The TIFF format is versatile; files are often saved uncompressed, although the format also supports optional compression schemes such as LZW.

PostScript (PS) and Encapsulated PostScript (EPS) are mixed formats that encode both raster and vector graphics. These were developed by Adobe, and are precursors of Adobe's Acrobat format.

The PBM (Portable Bitmap) format was developed by Jef Poskanzer as a generic intermediary UNIX format for translating images between formats with his Portable Bitmap Tools.  Rather than create N x (N-1) direct format translators for N image formats, the PBM library has 2N translators for 2-step conversions through PBM formats.  To convert a TIFF to a GIF, for example, you would use tifftopnm and then ppmtogif.
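
The savings from routing every conversion through one intermediary format grow quickly with the number of formats, as this small sketch shows (the function names are illustrative only):

```python
def direct_translators(n: int) -> int:
    """One translator for every ordered pair of different formats."""
    return n * (n - 1)


def hub_translators(n: int) -> int:
    """One translator into and one out of the intermediary, per format."""
    return 2 * n


for n in (3, 10, 30):
    print(f"{n} formats: {direct_translators(n)} direct vs {hub_translators(n)} via PBM")
```

With 10 formats that is 90 translators versus 20.  With only 3 formats the two approaches tie at 6, so the intermediary pays off once there are more than three formats.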

Formats for map images

Unlike photographs, maps typically have sharply defined feature boundaries that are vulnerable to "smearing" by JPEG compression.  For this reason it is generally best to save your maps as GIF images for display in your project web pages.

There are various ways of getting maps, charts and layouts out of ArcMap and into a web page:

  1. File--Export to create a GIF image directly.  You can control the size of the image by adjusting the resolution.  Set the aspect ratio (height/width) of your map frame so it fits the map subject without excessive white space; the output image file will have the same aspect ratio as the map frame, and you won't need to crop your image afterward. 
  2. Edit--Copy Map to Clipboard copies your map to the Windows clipboard for pasting into any graphic editing package such as Adobe Photoshop or the Windows Paint program.  Edit as needed and save in GIF format.  Paint only lets you save a pasted image in BMP format, but when you reload it, you can save it as GIF or JPEG.
  3. Use Alt-PrintScreen (screen-dump) to copy an entire ArcMap window to the Windows clipboard for pasting into an external editor. 

A word about copyright law

Pulling images off the web and scanning them from hardcopy sources is easier than ever.  You should be aware of the copyright laws that protect published images.  The Copyright Act of 1976, which took effect in 1978, protects a work of art (including photographs, paintings and illustrations) from the date of its creation through the life of the artist and 50 years after the artist's death; the Copyright Term Extension Act of 1998 lengthened this to 70 years after the artist's death.  Works that were created before 1978 fall under the Copyright Act of 1909 and the Copyright Renewal Act of 1992, under which a copyright lasted for a period of 75 years (extended to 95 years in 1998).  Older materials that have lapsed into the public domain can be used freely.  Next time you're browsing in a used book store, look for books with old photos or illustrations!

Copyright law does permit limited Fair Use of copyrighted materials. The fair use doctrine lets you quote or distribute parts of copyrighted materials for academic, journalistic or satirical purposes.  Fair Use is not very clearly defined in the law, but the courts are gradually clarifying it.  You do not want to be a defendant in a copyright case.

The best way to avoid copyright violation is to create your own graphics from scratch. Alternatively, there are vast collections of clip art available at reasonable cost; these collections are not copyright-free, but they are royalty-free and permit most uses.  Sometimes the clip art license will limit the number of images that can appear in a single publication, or the number of times you can reproduce an image.