It might seem that we perceive faces as we do other objects; however, the mind-brain uses a different form of representation to carry out face recognition. This is supported by research on certain cognitive disorders in which subjects' ability to recognize faces is unaffected while their object recognition is impaired. [And there is the converse: face recognition impaired but object recognition OK.] This evidence reinforces the idea that there are distinct and dedicated neural processing units for object and face recognition. It seems reasonable that such a distinction should occur in our processing system, given our nature as socially dependent animals: it is evolutionarily advantageous for us to be able to distinguish between members of our group (especially friends vs. foes). [But the issue remains: why would faces be a means of doing this?]
One possible explanation of face processing is that the mind-brain carries "snapshot pictures" of known faces. [Intuitively] This seems to be the case, for we can easily conjure up a picture-like image of someone's face. Such an explanation, though intuitively plausible, is not likely to be correct. Although we encounter faces from many different viewpoints, we make an invariant response in recognizing a particular face. On the snapshot explanation, this would require the mind-brain to store thousands upon thousands of pictures of each face, one for every angle. Moreover, if we encountered a known face from a new, unfamiliar angle, we would not be able to recognize it. Thus, a robust theory of face recognition must provide a viewpoint-invariant means of recognition.
More broadly, the idea that face representation is a set of analogue representations of every face encountered runs into the same problems as similar explanations posed for, say, object knowledge: is your face knowledge more like a cassette tape or a CD? Do you store faithful pictures of faces or do you regenerate faces each time from abstract specifications?
In order to explain the current widely accepted view, we will draw upon Marr's model of visual processing. During lower level visual processing, there is no distinction between object and face computation. All visual information goes through the first three levels of the Marr model, passing through the 1D (edge detection), 2D (surface reconstruction), and 2-1/2D (recognition of surface discontinuities) levels. (This claim is supported by cases in which problems in the 2-1/2D affect both face and object recognition.) At this point, however, a different processing system is invoked for face recognition. While object representations are then further abstracted into geons at the 3D level, face processing at the 3D level uses a different form of representation.
Instead of geons, face representations are composed of five "complexes": (1) forehead/hairline, (2) eyes, (3) nose, (4) mouth, and (5) chin. Within these complexes, there seems to be a hierarchy of importance, with the most emphasis placed upon the eyes. This again seems reasonable because the eyes clearly convey important emotional information (fear, anger, happiness, etc.).
These complexes are not local atomic units, but composites of information about the specific facial feature, its surrounding area, and certain relationships. [Thus, the elements of face knowledge are really different from the elements of object knowledge.] For example, the eye complex includes, in addition to the shape and structure of the eyes, information about the distance between the eyes. In addition, the complexes provide the major landmarks and boundary conditions of the face. The nose, for instance, is an important landmark, while the hairline and chin provide boundary information. An abstract representation of a whole face is thus composed of the information present in these five complexes. [Note how these complexes define the face as an abstract unit, from top (hairline) to bottom (chin), with major landmarks in between.]
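As a rough illustration only, the composite nature of the complexes can be sketched as a small data structure. Every name below (FacialComplex, FaceRepresentation, example_face, the field names) and every number is a hypothetical placeholder chosen for this sketch, not something the account specifies. Treating eyes, nose, and mouth as the landmarks between the two boundaries is one reading of the note above.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Hypothetical labels for the five complexes, listed from top to bottom.
COMPLEX_NAMES = ("forehead_hairline", "eyes", "nose", "mouth", "chin")


@dataclass
class FacialComplex:
    """A composite unit: the feature, its surrounding area, and relations."""
    name: str
    feature_shape: Tuple[float, ...]      # shape/structure of the feature itself
    surround: Tuple[float, ...] = ()      # information about the surrounding area
    relations: Dict[str, float] = field(default_factory=dict)  # e.g. eye spacing


@dataclass
class FaceRepresentation:
    """An abstract face built from exactly the five complexes."""
    complexes: Dict[str, FacialComplex]

    def __post_init__(self) -> None:
        if set(self.complexes) != set(COMPLEX_NAMES):
            raise ValueError("a face needs exactly the five complexes")

    def boundaries(self) -> Tuple[str, str]:
        # Hairline and chin bound the face from top to bottom.
        return ("forehead_hairline", "chin")

    def landmarks(self) -> Tuple[str, ...]:
        # Everything between the two boundaries, in top-to-bottom order.
        return tuple(n for n in COMPLEX_NAMES if n not in self.boundaries())


def example_face() -> FaceRepresentation:
    # Placeholder feature values; only the structure matters here.
    parts = {name: FacialComplex(name, (1.0, 1.0)) for name in COMPLEX_NAMES}
    parts["eyes"].relations["inter_eye_distance"] = 6.2  # arbitrary value
    return FaceRepresentation(parts)
```

The point of the sketch is that the eye complex carries relational information (the distance between the eyes) alongside the feature itself, so a complex is richer than a single local feature.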
An important constraint on face recognition involves the orientation of the input. Face processing prefers an upright (vertical) orientation of facial information: in face recognition experiments, subjects have much greater difficulty recognizing inverted faces. This notion of verticality applies to the complexes as well, making a face unrecognizable when certain complexes (such as eyes or mouth) are inverted. (Cf. the famous Margaret Thatcher illusion in the psychological literature!)
Face processing, therefore, is separate from object processing (after the 2-1/2D level). Face recognition uses five composite complexes from which an abstract representation of a face can be constructed, subject to certain constraints such as verticality. Experiments and certain cognitive disorders support this view. In addition, this account provides an explanation of the ability of our face recognition system to make an invariant response under varying viewpoints.
Note how this view of face knowledge raises the earlier issue of the nature of mental categories and the tension between categorical and prototypical conceptual organization. The atomic complexes of face knowledge provide the basis for deterministic face categories; but faces constructed according to these atoms could be averaged over their instances to give the prototypical face. That is, the atomic, categorical face is not the same as, but is not incompatible with, the prototypical, statistically determined face. Is the same true for other mental categories?
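The compatibility of the two views can be made concrete with a toy sketch. It assumes, purely for illustration, that each of the five complexes is summarized by a short list of numbers. The functions is_face, prototype, and typicality_distance are hypothetical names, and simple averaging is only one possible way of forming a prototype.

```python
from statistics import mean

# Hypothetical labels for the five complexes; feature vectors are made up.
COMPLEX_NAMES = ("forehead_hairline", "eyes", "nose", "mouth", "chin")


def is_face(candidate):
    """Categorical test: a face has all five complexes, and nothing else."""
    return set(candidate) == set(COMPLEX_NAMES)


def prototype(faces):
    """Statistical prototype: average each feature across all instances."""
    return {
        name: [mean(values) for values in zip(*(face[name] for face in faces))]
        for name in COMPLEX_NAMES
    }


def typicality_distance(face, proto):
    """Graded distance from the prototype (sum of absolute differences)."""
    return sum(
        abs(a - b)
        for name in COMPLEX_NAMES
        for a, b in zip(face[name], proto[name])
    )


# Two made-up face instances with two features per complex.
face_a = {name: [1.0, 3.0] for name in COMPLEX_NAMES}
face_b = {name: [3.0, 5.0] for name in COMPLEX_NAMES}
proto = prototype([face_a, face_b])
```

Here the prototype passes the same all-or-none category test as any individual face, yet it matches neither instance exactly, and each instance sits at a measurable distance from it. Category membership is deterministic while typicality is graded.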