Theory review
Every economic choice involves an opportunity cost, which is the value of the best alternative forgone. For example, if you spend 2 hours watching TV rather than studying or sleeping, the opportunity cost of that TV-watching is the sleep or learning you gave up, whichever you value more. A firm's decision to produce widgets instead of tongs, hops or oven-mitts implies an expectation that widgets will be at least as profitable as any other product. Rational behavior involves taking the best opportunity and thus minimizing the cost of the next-best opportunity not taken.
We distinguish between private and social costs of production. Private costs are borne by the producer directly. Social costs include all private costs as well as external costs borne by the community, e.g., pollution generated by the producer.
Review your basic production theory so you are familiar with the derivation of marginal value product and marginal cost curves, and the graphical linkages between optimal factor use (MVP=MFC) and optimal output (MR=MC). An increase in MFC shifts MC upward. An increase in output price P (MR) shifts MVP upward. Recall that MC is the extra cost of producing an additional unit of output. The area under the MC schedule up to any output level Y represents the cumulative variable cost of producing Y.
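As a worked illustration of the area-under-MC statement, assume a hypothetical linear schedule MC(y) = a + by (the parameters a and b are illustrative and do not come from these notes):

```latex
VC(Y) = \int_0^Y MC(y)\,dy = \int_0^Y (a + b y)\,dy = aY + \tfrac{1}{2} b Y^2
```

Differentiating VC(Y) with respect to Y returns MC(Y) = a + bY. Fixed costs do not appear because they do not vary with output.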
Producers always operate where MC and AC are increasing (stage 2). Short-run MC and AC rise steeply as output increases because the producer can only vary some inputs; in the long run the producer can increase output more efficiently by varying all inputs, so MC and AC rise less steeply. Firms produce as long as they can cover variable costs.
LRAC is the lower envelope of all short-run AC curves and determines production scale economies. LRAC may be U-shaped, with a definable minimum optimal scale (MOS) point, or else decline continuously, reflecting ever-increasing returns to scale. The MOS point represents the absolute minimum unit cost of production at the most efficient scale. Competitive firms migrate toward this scale. Oligopolies or natural monopolies may arise if demand is insufficient to support enough MOS firms to keep the market competitive.
LRAC may be "lumpy" if production technologies are not continuously scalable. Technological innovations shift MC and AC downward. Firms invest in R&D expecting that new technologies will reduce future production costs and improve the firm's competitive position in the market. Patents give firms temporary monopolies on new technologies, and thus encourage R&D.
Suppose a firm has multiple factories producing a good. Profit maximization implies the firm will equalize MC across all factories. This is analogous to a utility-maximizing consumer equating MU/P for all consumer goods. A competitive factor market with wage w implies all firms will have the same MVP from that factor.
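The equal-MC rule can be sketched numerically. The example below assumes hypothetical factories with linear marginal cost MC_i(y) = a_i + b_i*y and an interior solution (every factory produces a positive amount); the function names and cost parameters are illustrative, not from these notes.

```python
def allocate_output(total, factories):
    """Split `total` output across factories with MC_i(y) = a_i + b_i*y
    so that marginal cost is the same in every factory.
    Assumes an interior solution (all outputs non-negative)."""
    inv_slope_sum = sum(1.0 / b for _, b in factories)
    intercept_term = sum(a / b for a, b in factories)
    # Solve sum_i (mc - a_i)/b_i = total for the common marginal cost mc.
    common_mc = (total + intercept_term) / inv_slope_sum
    outputs = [(common_mc - a) / b for a, b in factories]
    if min(outputs) < 0:
        raise ValueError("corner solution: some factory should shut down")
    return common_mc, outputs


def total_cost(outputs, factories):
    """Variable cost: the area under each factory's MC line up to its output."""
    return sum(a * y + 0.5 * b * y * y for y, (a, b) in zip(outputs, factories))


# Hypothetical (a_i, b_i) pairs for two factories.
factories = [(2.0, 0.10), (4.0, 0.05)]
common_mc, outputs = allocate_output(120.0, factories)
print(common_mc, outputs, total_cost(outputs, factories))
```

Moving a unit of output from one factory to the other away from this allocation raises total cost, which is why a profit-maximizing firm has no reason to deviate from it.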
The firm's MC curve is its supply schedule, showing the quantities it will produce at various price levels. The market supply is the horizontal summation of all individual firms' MC schedules. The area between the market supply schedule and the market price is producer surplus, the revenues producers actually receive above their variable costs (the minimum they would be willing to accept).
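A minimal sketch of horizontal summation and producer surplus, assuming hypothetical price-taking firms with linear marginal cost MC_i(q) = a_i + b_i*q (the firm parameters and the price are made up for illustration):

```python
def firm_supply(price, a, b):
    """Quantity at which MC = a + b*q equals price; zero if price is below a."""
    return max(0.0, (price - a) / b)


def market_supply(price, firms):
    """Horizontal summation: add the firms' quantities at a given price."""
    return sum(firm_supply(price, a, b) for a, b in firms)


def producer_surplus(price, firms):
    """Area between the price line and each linear MC schedule,
    i.e. the triangle 0.5 * (price - a) * q for every active firm."""
    return sum(0.5 * (price - a) * firm_supply(price, a, b) for a, b in firms)


# Hypothetical (a_i, b_i) pairs for three firms.
firms = [(2.0, 0.5), (4.0, 1.0), (6.0, 0.25)]
price = 5.0
print(market_supply(price, firms), producer_surplus(price, firms))
```

At the assumed price the third firm stays out of the market because the price is below its lowest marginal cost, so it adds nothing to supply or to surplus.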
Pollution as a production "input"
Assume one firm manufactures good X with production function X = 100 L_X^{0.4} Q_X^{0.4} and a second firm manufactures good Y with production function Y = 100 L_Y^{0.6} Q_Y^{0.2}. These are both Cobb-Douglas production functions with diminishing returns to scale. The L's represent the quantities of labor used by each firm; the Q's represent each firm's pollution emissions. Let w represent the wage rate for labor and c represent the firm's cost of dumping its pollutants into the environment; these are the same for both firms.
Each firm maximizes its profits (TR - TC):
P_X 100 L_X^{0.4} Q_X^{0.4} - w L_X - c Q_X and P_Y 100 L_Y^{0.6} Q_Y^{0.2} - w L_Y - c Q_Y
Taking the partial derivatives with respect to L and Q yields the optimization conditions MVP = MFC, and we can solve these to determine the marginal condition that defines the optimal input ratio for each firm, MPP_L/MPP_Q = w/c. Rearranging this expression yields MPP_L/w = MPP_Q/c, which means that the marginal productivity per dollar spent on each input is the same.
In the case of the Cobb-Douglas models above, the optimality conditions simplify to Q_X/L_X = w/c = 3Q_Y/L_Y, or w L_X = c Q_X and w L_Y = 3c Q_Y.
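The intermediate steps can be written out as follows. The symbols \pi_X and \pi_Y are shorthand introduced here for the two profit expressions; everything else uses the notation of these notes.

```latex
\begin{aligned}
\frac{\partial \pi_X}{\partial L_X} &= 0.4\,\frac{P_X X}{L_X} - w = 0,
&\quad
\frac{\partial \pi_X}{\partial Q_X} &= 0.4\,\frac{P_X X}{Q_X} - c = 0
&\quad\Rightarrow\quad
\frac{Q_X}{L_X} &= \frac{w}{c} \\[4pt]
\frac{\partial \pi_Y}{\partial L_Y} &= 0.6\,\frac{P_Y Y}{L_Y} - w = 0,
&\quad
\frac{\partial \pi_Y}{\partial Q_Y} &= 0.2\,\frac{P_Y Y}{Q_Y} - c = 0
&\quad\Rightarrow\quad
\frac{3\,Q_Y}{L_Y} &= \frac{w}{c}
\end{aligned}
```

In each row the ratio follows from dividing the labor condition by the emissions condition, so the output price and the output level cancel.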
Pollution problems arise when firms don't pay the full cost of Q. If c, the firm's cost of Q, approaches zero, the firm pollutes until MPP_Q approaches zero.
Now suppose the government restricts aggregate use of Q to 100 units, and gives each firm an allowance of 50 units. Is this efficient? Although this 50:50 allocation might appear to be fair at first glance, it is generally not efficient, since it won't equate the MVP of Q across the two firms.
If the allowances were transferable between firms, however, we would expect firms to buy or sell allowances until the MVP of Q is the same across all firms. The market price for allowances ($a) would be included in the optimality condition Q_X/L_X = w/(c+a) = 3Q_Y/L_Y.
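A rough numerical sketch of allowance trading under the 100-unit cap, using the two production functions above. The output prices, wage, and dumping cost are assumed values chosen only for illustration, and the bisection search is one simple way to find the allocation, not a method taken from these notes.

```python
def labor_demand(P, A, a, b, Q, w):
    """Labor chosen so that the MVP of labor equals the wage, holding Q fixed."""
    return (a * P * A * Q ** b / w) ** (1.0 / (1.0 - a))


def mvp_q(P, A, a, b, Q, w):
    """Marginal value product of emissions at Q, with labor re-optimized."""
    L = labor_demand(P, A, a, b, Q, w)
    return b * P * A * L ** a * Q ** (b - 1.0)


def profit(P, A, a, b, Q, w, c):
    """Revenue minus labor and dumping costs at the optimal labor choice."""
    L = labor_demand(P, A, a, b, Q, w)
    return P * A * L ** a * Q ** b - w * L - c * Q


# Assumed values (not from the notes): output prices, wage, dumping cost.
P_X = P_Y = 1.0
w, c, cap = 10.0, 1.0, 100.0
firm_x = (P_X, 100.0, 0.4, 0.4)  # X = 100 L^0.4 Q^0.4
firm_y = (P_Y, 100.0, 0.6, 0.2)  # Y = 100 L^0.6 Q^0.2


def excess_mvp(q_x):
    """MVP of Q at firm X minus MVP of Q at firm Y when X holds q_x units."""
    return mvp_q(*firm_x, q_x, w) - mvp_q(*firm_y, cap - q_x, w)


# Bisection: excess_mvp falls as q_x rises, so shrink toward its zero.
lo, hi = 1e-9, cap - 1e-9
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if excess_mvp(mid) > 0:
        lo = mid
    else:
        hi = mid

q_x = 0.5 * (lo + hi)
q_y = cap - q_x
allowance_price = mvp_q(*firm_x, q_x, w) - c  # the $a in w/(c+a)

traded_profit = profit(*firm_x, q_x, w, c) + profit(*firm_y, q_y, w, c)
equal_profit = profit(*firm_x, 50.0, w, c) + profit(*firm_y, 50.0, w, c)
print(q_x, q_y, allowance_price, traded_profit - equal_profit)
```

With these assumed numbers firm Y values the marginal allowance more than firm X at the 50:50 split, so Y buys allowances from X until the two MVPs meet, and combined profit ends up higher than under the equal split.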
An equivalent result could be obtained via a tax on use of Q at rate $t, so that both firms adjust L and Q to meet the revised optimality condition Q_X/L_X = w/(c+t) = 3Q_Y/L_Y. Note that the higher cost of Q due to the pollution tax or cost of allowances implies that both firms will use more L relative to Q, substituting labor for emissions. (When firms claim that environmental regulations cause layoffs, they're not always being truthful.)
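The tax condition can be checked with closed-form factor demands for a price-taking Cobb-Douglas firm. This is a sketch under assumed output prices, wage, dumping cost, and tax rates; the function name `factor_demands` is hypothetical.

```python
def factor_demands(P, A, a, b, w, q_cost):
    """Profit-maximizing output, labor, and emissions for a price-taking firm
    with output = A * L**a * Q**b and a + b < 1, derived from MVP = MFC.
    `q_cost` is the full per-unit cost of Q (c plus any tax)."""
    scale = A * (a * P / w) ** a * (b * P / q_cost) ** b
    output = scale ** (1.0 / (1.0 - a - b))
    labor = a * P * output / w
    emissions = b * P * output / q_cost
    return output, labor, emissions


# Assumed values (not from the notes).
P_X = P_Y = 1.0
w, c = 10.0, 1.0

for t in (0.0, 1.0):
    _, L_x, Q_x = factor_demands(P_X, 100.0, 0.4, 0.4, w, c + t)
    _, L_y, Q_y = factor_demands(P_Y, 100.0, 0.6, 0.2, w, c + t)
    # All three printed ratios should agree: Q_X/L_X = 3 Q_Y/L_Y = w/(c+t).
    print(t, Q_x / L_x, 3 * Q_y / L_y, w / (c + t))
```

The example reports the ratio of Q to L for each firm, which is the quantity the optimality condition pins down; a higher tax lowers that ratio for both firms.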
A note on efficiency
Efficient markets operate where supply (MC) equals demand (marginal WTP). Efficiency implies minimized opportunity costs and maximized economic welfare, measured as the sum of consumer surplus and producer surplus. Efficiency can be measured objectively, since these surpluses are quantifiable. Competitive forces drive market economies toward efficiency. However, the efficiency criterion does not address the issue of how the economic welfare should be distributed in society. The distribution of economic welfare involves more subjective measures of equity or "fairness."
A situation is Pareto-efficient if no one can be made better
off without someone else being made worse off. Any redistribution
that leaves someone worse off fails this efficiency criterion. The less
restrictive Hicks-Kaldor efficiency criterion is that a policy is
efficient if the net benefits are positive, i.e., the benefit to the
beneficiaries outweighs the cost to the payers, so the "winners" could
(at least in theory) compensate the "losers" and still be better off.
Remember...the economy is not a zero-sum game!