FREC 444--Economics of Environmental Management 
Extensions of CVM; Validity


Survey Procedures

There are many ways we can elicit WTP or WTA measures with surveys.

  1. We might explain the hypothetical change in Q and then ask an open-ended valuation question to elicit maximum WTP or minimum WTA directly, e.g.: "How much would you be willing to pay for the change from Q0 to Q1?"
  2. A variant involves giving respondents a payment card with different dollar amounts, asking them the same question, and having them indicate the amount nearest their maximum WTP or minimum WTA.
  3. Referenda involve a single yes-or-no question to determine either a lower or upper bound on max WTP/min WTA: "Would you be willing to pay $A for Q1?" A "yes" response implies U[$(M-A),Q1] > U[$M,Q0] (where $M is status quo income), and therefore WTP > $A; a "no" response implies the reverse. This approach minimizes respondent burden but requires much larger respondent samples and more sophisticated statistical procedures.
  4. Bidding games involve iterative yes-or-no questions to search out the respondent's max WTP/min WTA: "Would you be willing to pay $A for Q1?" "$B?" Bidding games are generally conducted in personal interviews (much more expensive than mail surveys), and may exhaust respondent patience as the interviewer tries to narrow the bounds on true max WTP/min WTA.
  5. Binary contingent choice involves asking respondents to choose between alternative scenarios: "Which do you prefer: [$A,Q1] or [$B,Q2]?" The response indicates the direction of the inequality U[$A,Q1] <> U[$B,Q2].  An empirical utility index function is then estimated using discrete choice regression procedures.
  6. Contingent ranking involves asking respondents to rank (indicate their preference ordering for) multiple alternative scenarios.  The rankings index the utility levels of the scenarios.
  7. Contingent rating involves asking respondents to indicate relative preferences for multiple alternative scenarios using an ad hoc utility index (e.g., 1 to 10) provided by the researcher.  Ratings or rankings can be regressed against scenario characteristics to obtain coefficients indexing marginal utilities of characteristics.
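
As a rough illustration of the "more sophisticated statistical procedures" that the referendum format (item 3) calls for, the sketch below fits a simple logit model to grouped yes/no responses and reads off median WTP as the bid at which half the respondents would say yes.  The bid levels and response counts are invented for illustration, the function name fit_logit is hypothetical, and the logit form is only one common choice, not necessarily the model used in any study cited in these notes.

```python
import math

def fit_logit(bids, n_asked, n_yes, iterations=25):
    """Fit P(yes | bid) = 1 / (1 + exp(-(a + b * bid))) by Newton-Raphson.

    bids    -- dollar amounts $A offered to different subsamples
    n_asked -- number of respondents offered each bid
    n_yes   -- number of "yes" responses at each bid
    """
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g_a = g_b = 0.0            # gradient of the log-likelihood
        h_aa = h_ab = h_bb = 0.0   # entries of the negative Hessian
        for x, n, y in zip(bids, n_asked, n_yes):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = n * p * (1.0 - p)
            g_a += y - n * p
            g_b += (y - n * p) * x
            h_aa += w
            h_ab += w * x
            h_bb += w * x * x
        # Newton step: solve the 2x2 system (negative Hessian) * step = gradient.
        det = h_aa * h_bb - h_ab * h_ab
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b

# Hypothetical referendum data: five bid levels, 100 respondents per bid.
bids = [10, 20, 30, 40, 50]
n_asked = [100, 100, 100, 100, 100]
n_yes = [90, 70, 50, 30, 10]

a, b = fit_logit(bids, n_asked, n_yes)

# P(yes) = 0.5 where a + b * bid = 0, so median WTP = -a / b.
median_wtp = -a / b
print(f"a = {a:.3f}, b = {b:.4f}, median WTP = ${median_wtp:.2f}")
```

Because the made-up responses are symmetric around the $30 bid, the estimated median WTP comes out at $30.  Note that each respondent answers only one yes-or-no question, which is why the format needs many respondents spread across several bid levels.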

The validity of contingent valuation

As you might suspect, contingent valuation methods have a number of potential problems that cast doubt on the validity of the valuations they elicit.  These problems are explored in an extensive environmental economics literature.  Contingent valuation was used to determine most of the $2.4 billion damage estimate Exxon paid for the Exxon Valdez oil spill in Alaska, and that litigation sparked some particularly intense debate over the validity of the method.

The validity of contingent valuation depends on thoughtful, honest, well-informed responses, but the hypothetical nature of contingent valuation surveys makes it extremely difficult to prove the validity of the method.  And it isn't much easier to disprove contingent valuation's validity, since you don't prove anything by "doing it wrong."  Besides the usual sampling and response bias problems that can affect any survey, Mitchell and Carson (1989) have developed an extensive taxonomy of potential problems with the method.  Here is a quick summary:

The NOAA Panel (see below) put the debate over the validity of contingent valuation in the spotlight.  Diamond and Hausman have raised a number of serious criticisms of the method; Hanemann has provided a strong defense of it.

Criticism of contingent valuation

Diamond and Hausman argue that contingent valuation is too vulnerable to "embedding effects," and too sensitive to question order, framing effects, payment vehicle and other biases, to be relied on for policy-making.  Survey respondents simply can't provide meaningful valuations of unfamiliar goods in unrealistic transaction contexts.  They cite several contingent valuation studies that illustrate "embedding effects," where respondent WTP's don't reflect variations in the scope of the amenity.  For example, Desvousges et al. found little difference between WTP's for preventing the deaths of 2,000, 20,000 or 200,000 birds from exposure to waste oil holding ponds; if these respondents were actually valuing birds, theory suggests their WTP's should vary about 100-fold.  If income and substitution effects don't explain the lack of variation in WTP, then what are these respondents really valuing?  Perhaps many respondents are just getting a "warm glow" from expressing their support for a better environment or their disapproval of pollution.  Will WTP to clean up a natural oil seepage be the same as WTP to clean up a man-made spill of the same magnitude?  Probably not.

Contingent valuation questions often elicit extreme values from some respondents, either implausibly high WTP's or zero ("protest") WTP's.   Some contingent valuation researchers "trim" these implausible and zero responses from their data sets.  Diamond and Hausman argue that an arbitrarily-chosen cutoff doesn't necessarily eliminate bogus WTP responses.  They cite Schkade and Payne's "verbal protocol" analysis in which survey respondents were asked to "think out loud" as they formulated valuation responses, noting that respondent comments tended to support the "warm glow" hypothesis.  They cite a study by Samples and Hollyer where respondents' WTP's to protect seals and whales depended on survey question order--which species the respondent valued first.

Respondents' WTP's often don't add up properly.  For example, Schulze et al. found a mean WTP of $72.46 for a complete cleanup of a Superfund site in Montana, versus a mean WTP of $72.02 for only a partial cleanup.  Contingent valuation is particularly questionable when WTP's include altruism or concern over the actions of other people.  Suppose I threaten to beat my dog and you have some WTP for my not doing so; do you have a welfare gain if I don't?  If we count your psychic harm from seeing birds poisoned by pollution on the nightly news, why can't we count my psychic harm from seeing people lose their jobs when a polluting factory is forced to close?

Diamond and Hausman reject the idea that "some number is better than no number," arguing that contingent valuations are no more meaningful or reliable than regular opinion polls.

Defense of contingent valuation

Hanemann discusses various procedures for improving the reliability of contingent valuation methods: avoid convenience sampling (e.g., mail or phone surveys, interviewing passersby at the mall); use statistically-based sampling; make the questions realistic and use closed-ended (referenda) questions.  The referendum format is closer to the posted-price shopping US consumers are accustomed to, and eliminates possible incentives for strategically-biased responses.  The analyst should focus attention on median WTP, which is less sensitive to outlier values than mean WTP.
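
The point about median versus mean WTP can be seen with a few lines of arithmetic.  The sketch below uses made-up WTP responses (not data from any study) and shows how a single implausibly high bid moves the mean far more than the median.

```python
from statistics import mean, median

# Hypothetical WTP responses in dollars (invented for illustration).
wtp = [0, 5, 10, 10, 15, 20, 20, 25, 30, 40]

# The same sample plus one implausibly high bid.
wtp_with_outlier = wtp + [5000]

mean_plain, median_plain = mean(wtp), median(wtp)
mean_outlier, median_outlier = mean(wtp_with_outlier), median(wtp_with_outlier)

print(f"without outlier: mean = {mean_plain:.2f}, median = {median_plain:.2f}")
print(f"with outlier:    mean = {mean_outlier:.2f}, median = {median_outlier:.2f}")
```

In this invented sample the mean and median are both $17.50 before the outlier is added.  Afterward the mean jumps to roughly $470 while the median only moves to $20.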

Hanemann concedes that contingent valuation surveys ask inherently difficult questions, may elicit "satisficing" rather than accurate answers, and may be sensitive to nuances in wording.  In response to the charge that "the survey process creates the values," he cites recent psychological theories that "all cognition is a constructive process--people construct their memories, their attitudes, and their judgments," and that "...people are cognitive misers: they tend to resolve problems of reasoning and choice in the simplest way possible....The real issue is not whether preferences are a construct, but whether they are a stable construct."  Various survey-resurvey studies have shown that WTP bids are in fact reasonably consistent over time.

Contingent valuations can be compared with valuations inferred from observed behaviors, and often compare closely.  For example, Bishop and Heberlein found a mean hypothetical WTP of $31 for deer-hunting licenses in Wisconsin, versus a mean actual WTP of $35 elicited from a constructed market experiment.

Hanemann used a 10 percent trim of Desvousges et al.'s data on WTP to save birds from waste oil, and found the remaining data satisfied the scope test.  He argues that sub-additivity, where WTP's for partial improvement sum to more than WTP for the total improvement, may simply reflect substitutabilities between partial improvements.
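
To make the mechanics of trimming concrete, here is a minimal sketch of a trimmed mean applied to two invented samples, one for a small program and one for a larger program.  These are not Desvousges et al.'s data.  The notes don't say whether a "10 percent trim" drops 10 percent in total or 10 percent from each tail, so the hypothetical function trimmed_mean below assumes 10 percent from each tail.

```python
def trimmed_mean(values, trim_fraction):
    """Mean after dropping trim_fraction of the observations from each tail.

    Assumption: trimming is symmetric (same share cut from top and bottom).
    """
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)   # observations dropped per tail
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical WTP responses in dollars for two program sizes.
small_scope = [0, 0, 5, 5, 10, 10, 15, 20, 25, 900]
large_scope = [0, 5, 10, 15, 20, 25, 30, 40, 50, 600]

raw_small = sum(small_scope) / len(small_scope)
raw_large = sum(large_scope) / len(large_scope)
trim_small = trimmed_mean(small_scope, 0.10)
trim_large = trimmed_mean(large_scope, 0.10)

print(f"raw means:     small = {raw_small:.2f}, large = {raw_large:.2f}")
print(f"trimmed means: small = {trim_small:.2f}, large = {trim_large:.2f}")
```

In this made-up example the raw means ($99.00 versus $79.50) are driven by one extreme bid in each sample and run in the wrong direction, while the trimmed means ($11.25 versus $24.38) rise with scope.  The result depends on where the cutoff is placed, which is the choice Diamond and Hausman call arbitrary.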

As noted above, the validity of contingent valuation is almost impossible to prove or disprove.  Each side in this debate raises valid criticisms of the studies cited by the other, and there are few objective criteria on which to distinguish "good" from "bad" studies.  Adherence to the NOAA Panel's guidelines (below) will help practitioners minimize many of the biases that can affect contingent valuation studies.

Contingent valuation in the policy arena

Since the 1970's contingent valuation has gradually gained a degree of credibility as academic researchers tested the limits of its reliability.  The method has received particularly intense scrutiny since 1989 when a Federal appeals court directed the US Dept. of Interior to revise regulatory provisions of the Superfund law under which the government can sue polluters for damages to natural resources, and instructed the DOI to give equal consideration to lost use and non-use values due to the dumping of hazardous waste.

As DOI was revising its regulations, the Exxon Valdez oil spill in March 1989 suddenly illustrated the potential magnitude of such damages.  Congress passed the Oil Pollution Act (1990), directing the Commerce Dept.'s National Oceanic and Atmospheric Administration (NOAA) to develop procedures for determining economic damages from oil spills.  NOAA asked Nobel laureates Kenneth Arrow and Robert Solow to chair a panel of experts to evaluate the reliability of the contingent valuation method for evaluating non-use values of environmental amenities.

The ensuing debate pitted long-time practitioners of the method against prominent skeptics.  The previous section summarizes the main arguments pro and con.  The NOAA Panel came to the cautious conclusion that contingent valuation "can produce estimates reliable enough to be the starting point of a judicial process of damage assessment, including lost passive-use values," and recommended a set of procedural guidelines that should maximize the reliability of such analyses.  A partial list:

A large-scale contingent valuation study by Carson et al. (1992) estimated the non-use environmental damages caused by the Exxon Valdez spill at almost $3 billion.  Although the NOAA Panel favors WTP questions, here the theoretically correct damage measure would be WTA.  Exxon had settled the case out of court the previous year for $1.15 billion.