This approach has been pioneered by Logothetis and his colleagues working on the macaque visual system. They trained the monkey to report which of two rivalrous inputs it saw. The experiments are difficult, and elaborate precautions had to be taken to make sure the monkey was not cheating. The fairly similar distribution of switching times strongly suggests that monkeys and humans perceive these bistable visual inputs in the same way.
The first set of experiments (Logothetis and Schall, 1989) studied neurons in cortical area MT (middle temporal, also called V5), since they preferentially respond to movement. The stimuli were vertically drifting horizontal gratings. Only the first response was recorded. Of the relevant neurons, only about 35% were modulated according to the monkey's reported percept. Surprisingly, half of these responded in the opposite direction to the one expected.
The second set of experiments (Leopold and Logothetis, 1996) used stationary gratings. The orientation was chosen in each case to be optimal for the neuron studied, and orthogonal to it in the other eye. They recorded how the neuron fired during several alternations of the reported percept. The neurons were in foveal V1/V2 and in V4. The fraction following the percept in V4 was similar to that in MT, but a rather smaller fraction of V1/V2 neurons followed the percept. Also, here, but not in V4, none of the cells were anticorrelated with the stimulus.
The results of the third set of experiments (Sheinberg and Logothetis, 1997) were especially striking. In this case the visual inputs tried included images of humans, monkeys, apes, wild animals, butterflies, reptiles and various man-made objects. The rivalrous image was usually a sunburst-like pattern (see Figure 2). If a new image was flashed into one eye while the second eye was fixating another pattern, the new stimulus was the one that was always perceived ("flash suppression"). Recordings were made in the upper and lower banks of the superior temporal sulcus (STS) and inferior temporal cortex (IT). Overall, approximately 90% of the recorded neurons in STS and IT were found to reliably predict the perceptual state of the animal. Moreover, many of these neurons responded in an almost all-or-none fashion, firing strongly for one percept, yet only at noise level for the alternative one.
Figure 2: The activity of a single neuron in the superior temporal sulcus (STS) of a macaque monkey in response to different stimuli presented to the two eyes (taken from Sheinberg and Logothetis, 1997). In the upper left panel a sunburst pattern is presented to the right eye without evoking any firing response ("ineffective" stimulus). The same cell will fire vigorously in response to its "effective" stimulus, here the image of a monkey's face (upper right panel). When the monkey is shown the face in one eye for a while, and the sunburst pattern is flashed onto the monitor for the other eye, the monkey signals that it is "seeing" this new pattern and that the stimulus associated with the rivalrous eye is perceptually suppressed ("flash suppression"; lower left panel). At the neuronal level, the cell shuts down in response to the ineffective yet perceptually dominant stimulus following stimulus onset (at the dotted line). Conversely, if the monkey fixates the sunburst pattern for a while, and the image of the face is flashed on, it reports that it perceives the face, and the cell will now fire strongly (lower right panel). Neurons in V4, earlier in the cortical hierarchy, are largely unaffected by perceptual changes during flash suppression.
More recently, Bradley et al. (1997) have studied a different bistable percept in macaque MT, produced by showing the monkey, on a TV screen, the 2D projection of a transparent, rotating cylinder with random dots on it, without providing any stereoscopic disparity information. Human subjects exploit structure-from-motion and see a 3D cylinder rotating around its axis. Without further cues, the direction of rotation is ambiguous, and observers first report rotation in one direction, then, a few seconds later, rotation in the other direction, and so on. The trained monkey responds as if it saw the same alternation. In their studies on the monkey, about half the relevant MT neurons Bradley et al. recorded from followed the percept (rather than the "constant" retinal stimulus).
These are all exciting experiments, but they are still in the early stages. Just because a particular neuron follows the percept, it does not automatically imply that its firing is part of the NCC. The NCC neurons may be mainly elsewhere, such as higher up in the visual hierarchy. It is obviously important to discover, for each cortical area, which neurons are following the percept (Crick, 1996). That is, what type of neurons are they, in which cortical layer or sublayer do they lie, in what way do they fire, and, most important of all, where do they project? It is, at the moment, technically difficult to do this, but it is essential to have this knowledge, or it will be almost impossible to understand the neural nature of consciousness.
Electrical Brain Stimulation
An alternate approach, with roots going back to Penfield (1958), involves directly stimulating cortex or related structures in order to evoke a percept or behavioral act. Libet and his colleagues (Libet, 1993) have used this technique to great advantage on the somatosensory system of patients. They established that a stimulus, at or near threshold, delivered through an electrode placed onto the surface of somatosensory cortex or into the ventrobasal thalamus required a minimal stimulus duration (between 0.2 and 0.5 sec) in order to be consciously perceived. Shorter stimuli were not perceived, even though they could be detected with above-chance probability, using a two-alternative forced choice procedure. In contrast, a skin or peripheral sensory-nerve stimulus of very short duration could be perceived. The difference appears to reside in the number and type of neurons recruited during peripheral stimulation versus direct central stimulation. Using sensory events as a marker, Libet also established (1993) that events caused by direct cortical stimulation were back-dated to the beginning of the stimulation period.
In a series of classical experiments, Newsome and colleagues (Britten et al., 1992) studied the macaque monkey's performance in a demanding task involving visual motion discrimination. They established a quantitative relationship between the performance of the monkey and the neuronal discharge of neurons in its middle temporal cortex (MT). In 50% of all the recorded cells, the psychometric curve -- based on the behavior of the entire animal -- was statistically indistinguishable from the neurometric curve -- based on the averaged firing rate of a single MT cell. In a second series of experiments, cells in MT were directly stimulated via an extracellular electrode (Salzman et al., 1990; MT cells are arranged in columnar structure for direction of motion). Under these conditions, the performance of the animal shifted in a predictable manner, compatible with the idea that the small brain stimulation caused the firing of enough MT neurons, encoding for motion in a specific direction, to influence the final decision of the animal. It is not clear, however, to what extent visual consciousness for this particular task is present in these highly overtrained monkeys.
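To make the notion of a neurometric curve concrete, the following is a minimal sketch of one common way such a curve can be computed, under assumptions that are ours and not taken from the studies cited above. For each stimulus strength, the spike counts of a single cell on trials with motion in its preferred direction are compared with its counts on trials with motion in the opposite direction, and the area under the ROC curve gives the fraction of correct choices an ideal observer would make from that cell alone. The function names, the stimulus-strength values and all spike counts are hypothetical and serve only as illustration.

```python
from itertools import product


def roc_area(pref_counts, null_counts):
    """Probability that a spike count drawn from preferred-direction trials
    exceeds one drawn from opposite-direction trials (ties count one half).
    This equals the area under the ROC curve for the two distributions."""
    wins = 0.0
    for p, n in product(pref_counts, null_counts):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pref_counts) * len(null_counts))


def neurometric_curve(trials_by_strength):
    """Map each stimulus strength to the ideal-observer performance
    predicted from a single cell's spike counts."""
    return {
        strength: roc_area(pref, null)
        for strength, (pref, null) in sorted(trials_by_strength.items())
    }


# Hypothetical spike counts (made up for illustration, not recorded data):
# stimulus strength -> (preferred-direction trials, opposite-direction trials)
example_trials = {
    0.0: ([10, 12, 11, 9], [10, 12, 11, 9]),
    0.5: ([20, 22, 25, 19], [8, 9, 11, 10]),
}

curve = neurometric_curve(example_trials)
for strength, performance in curve.items():
    print(f"strength {strength}: predicted fraction correct {performance:.2f}")
```

A curve obtained in this way can then be set against the psychometric curve, that is, the fraction of correct choices made by the animal at each stimulus strength.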
The V1 Hypothesis
We have argued (Crick and Koch, 1995a) that one is not directly conscious of the features represented by the neural activity in primary visual cortex. Activity in V1 may be necessary for vivid and veridical visual consciousness (as is activity in the retinae), but we suggest that the firing of none of the neurons in V1 directly correlates with what we consciously see (for a critique of our hypothesis, see Pollen, 1995, and our reply, Crick and Koch, 1995b).
Our reasons are that at each stage in the visual hierarchy the explicit aspects of the representation we have postulated are always recoded. We have also assumed that any neurons expressing an aspect of the NCC must project directly, without recoding, to at least some of the parts of the brain that plan voluntary action -- that is what we have argued seeing is for. We think that these plans are made in some parts of frontal cortex (see below).
The neuroanatomy of the macaque monkey shows that V1 cells do not project directly to any part of frontal cortex (Crick and Koch, 1995a). Nor do they project to the caudate nucleus of the basal ganglia (Saint-Cyr et al., 1990), the intralaminar nuclei of the thalamus (LG Ungerleider, personal communication), the claustrum (Sherk, 1986) or the brain stem, with the exception of a small projection from peripheral V1 to the pons (Fries, 1990). It is plausible, but not yet established, that this lack of connectivity is also true for humans.
The strategy to verify or falsify this and related hypotheses is to relate the receptive field properties of individual neurons in V1 or elsewhere to perception in a quantitative manner. If the structure of perception does not map to the receptive field properties of V1 cells, it is unlikely that these neurons directly give rise to consciousness. In the presence of a correlation between perceptual experience and the receptive field properties of one or more groups of V1 cells, it is unclear whether these cells just correlate with consciousness or directly give rise to it. In that case, further experiments need to be carried out to untangle the exact relationship between neurons and perception.
A possible example may make this clearer. It is well known that the color we perceive at one particular visual location is influenced by the wavelengths of the light entering the eye from surrounding regions in the visual field (Land and McCann, 1971; Blackwell and Buchsbaum, 1988). This form of (partial) color constancy is often called the Land effect. It has been shown in the anesthetized monkey (Zeki, 1980, 1983; Schein and Desimone, 1990) that neurons in V4, but not in V1, exhibit the Land effect. As far as we know, the corresponding information is lacking for alert monkeys. If the same results could be obtained in a behaving monkey, it would follow that the animal is not directly aware of the activity of the "color" neurons in V1.
Some Experimental Support
In the last two years, a number of psychophysical, physiological and imaging studies have provided some support for our hypothesis, although this evidence falls short of proving it (He et al., 1995; Cumming and Parker, 1997; Kolb and Braun, 1995; summarized in Koch and Braun, 1996; but see Morgan et al., 1997). Let us briefly discuss two other cases.
When two isoluminant colors are alternated at frequencies beyond 10 Hz, humans perceive only a single fused color with a minimal sensation of brightness flicker. In spite of the perception of color fusion, color opponent cells in primary visual cortex of two alert macaque monkeys follow high-frequency flicker well above heterochromatic fusion frequencies (Gur and Snodderly, 1997). In other words, neuronal activity in V1 can clearly represent certain retinal stimulation yet is not perceived. This is support
