Sydney symposium paper DRAFT 01-03-01

Overlapping Mental Representations of Self and Group: Evidence and Implications

Eliot R. Smith

Purdue University (West Lafayette, Indiana USA)

DRAFT Talk for Sydney symposium March 2001

This paper will span several apparently quite disparate levels of analysis:

- detailed models of mental process and representation (specifically, connectionist models)

- psychological phenomena (specifically, patterns of response times for self-descriptiveness judgments)

- social and intergroup phenomena (specifically, identification with a social group and its consequences)

- with some speculative links to evolution, self-regulation, close relationships, and so on.

I hope to elucidate some connections among these levels, making the case that they are highly relevant to each other. In a nutshell, the basic argument of the paper is this. Social identity theory and self-categorization theory, currently the most well-developed approaches to intergroup phenomena, rest on the assumption that intergroup behavior depends crucially on the psychological self. They postulate that the psychological self is flexible, and depending on the context may extend beyond the isolated individual to incorporate relationships with other people and identification with social groups. However, the underlying mechanisms of this "incorporation" or "overlap" have not been theoretically explicated in detail. The connectionist approach to mental representation can fill this gap. I will present a connectionist framework that can account for the flexibility and context sensitivity of the self (and other mental representations) and in turn, the ways that a salient group membership and identification can alter the psychological representation of the self.

Connectionism

In cognitive and developmental psychology (McClelland & Rumelhart, 1986; Elman et al., 1996) and now increasingly in social psychology (Read & Miller, 1998), connectionist models are being explored and developed as a fundamental alternative to more traditional types of models of mental representation. This represents a true shift in the guiding metaphor for understanding cognition, from computation to biology. The computational metaphor had dominated since mid-century, aided by the development of modern computers and the ascendancy of the cognitive perspective within psychology. But with an increasing recognition that cognition evolved for the control of adaptive behavior, a more biologically based approach to thinking about cognition has taken hold (Brooks, 1991; Smith & Semin, 2000). Connectionist models are a part of this newer thinking, insofar as they are biologically inspired (although in most cases far from faithful to the detailed workings of biological neurons).

Connectionist models, or more specifically distributed connectionist models, are most easily described in terms of their contrasts with more familiar types of models (see Smith, 1998 for a more detailed presentation). Symbolic models include all those most familiar in social psychological theory (schemas, associative networks, storage bins, and the like). Localist connectionist (parallel constraint satisfaction) models constituted the first generation of connectionist models applied in social psychology (e.g., Shultz & Lepper, 1996; Kunda & Thagard, 1996). In both of these types of models, complete representations are constructed from individually semantically meaningful nodes by outside processes. A typical illustration of such a model would present four nodes representing, say, a person and three behaviors, connected by links that bind them together into a complete symbolic structure. The nodes themselves are passive bearers of information (like words on a page of text) rather than active processing units.

In contrast, in distributed connectionist models any meaningful representation is a pattern of activation across a number of nodes. No single node has any fixed meaning (unlike the P or person node in the above illustration). Conversely, the same units participate in many different patterns that represent different objects or concepts (a property termed superposition). A familiar analogy is the individual pixels on a TV screen. No one pixel has any fixed meaning, but by taking on different patterns of color and illumination, the set of pixels can collectively represent a very large number of different images. Nodes are active processing units, rather than passive information bearers that have to be assembled by external procedures. The nodes are richly interconnected, as shown in this illustration, and send signals to each other over these connections. The connections are relatively permanent (although their strengths may change with time as described below), rather than being dynamically created like the structure shown in the first figure.

Connectionist nodes are not only vehicles for representation; they also actively process information. Specifically, each unit has a property termed activation that is assumed to change from moment to moment. Each unit's activation level is a function of its own previous activation as well as the inputs the unit receives over incoming connections. In turn, each unit sends output (a function of its activation) as a signal to other units, over its outgoing connections. These connections are weighted (the signal sent over the connection is multiplied by a numerical weight) and unidirectional (signals pass in only one direction, conventionally indicated by an arrowhead, over each connection). However, a pair of connections may go in opposite directions between two units so that each can influence the other. Inputs arriving at a unit from separate units are simply summed.

Letting a symbolize a unit's activation and w a weight, the net input to unit 1, for example, is w₁₂a₂ + w₁₃a₃ (where w₁₂ is the weight on the connection from unit 2 to unit 1). The activation of unit 1 at the next time point will be a function of that net input value.
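This update can be sketched numerically. The following is a minimal illustration only: the three-unit network, the particular weight values, the logistic squashing function, and the decay parameter are all illustrative assumptions, not part of the framework described here.

```python
import numpy as np

def update_activations(a, W, decay=0.5):
    """One update step: each unit's new activation depends on its own
    previous activation (via an assumed decay term) and on its net
    input, the sum of weighted signals from other units. W[i, j] is
    the weight on the connection from unit j to unit i."""
    net = W @ a                                  # summed weighted inputs
    squashed = 1.0 / (1.0 + np.exp(-net))        # squash into (0, 1)
    return decay * a + (1 - decay) * squashed

a = np.array([0.0, 1.0, 0.5])        # current activations of units 1-3
W = np.array([[0.0, 0.8, -0.4],      # unit 1 receives w12 = 0.8, w13 = -0.4
              [0.8, 0.0,  0.2],
              [-0.4, 0.2, 0.0]])

# Net input to unit 1, exactly as in the formula above:
net_input_unit1 = W[0, 1] * a[1] + W[0, 2] * a[2]
print(net_input_unit1)               # 0.8*1.0 + (-0.4)*0.5 = 0.6
print(update_activations(a, W))
```

Iterating this update repeatedly lets activation flow through the network until the pattern of activations settles.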

In contrast to the rapidly changing activation level of each unit, weights on connections change only slowly, through the operation of the network's learning rule. These weights therefore are the repository of the network's long-term knowledge.

Autoassociative Connectionist Memory

One specific type of connectionist network, an autoassociative memory, will be the focus of our discussion. A detailed discussion of the workings of an autoassociative memory can be found in Smith and DeCoster (1998). The important point for the present is the way such a network can behave. The figure shows the basic idea.

Operation of the network can conveniently be discussed as involving two phases. During the learning phase, many patterns are input to the units, many times each. After each pattern is presented, a specific learning rule is applied that slightly modifies all the connection weights. Weights are changed upwards between pairs of units that are concurrently active in the given pattern, and downwards between pairs of units that have different activation levels. Thus, connections are strengthened between units that tend to be co-active because they are included in similar patterns. (This is termed the Hebb rule for unsupervised learning.) Following an adequate amount of learning, the network is able to carry out pattern reconstruction. Suppose that a pattern similar to a previously seen pattern, or a subset of the units that make up a previously seen pattern, is input to the network. Activation flowing from the nodes included in that subset will flow along the strengthened connections, and will ultimately activate the rest of the learned pattern. This amounts to reconstruction (not "retrieval") of the whole pattern from a partial or erroneous cue.
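The two phases just described can be made concrete with a toy example. The bipolar (+1/-1) activation coding, the outer-product form of the Hebb rule, and the three hand-picked patterns below are simplifying assumptions chosen for illustration; see Smith and DeCoster (1998) for the model actually used.

```python
import numpy as np

patterns = np.array([
    [1, 1, 1, 1, -1, -1, -1, -1],   # three learned activation patterns
    [1, 1, -1, -1, 1, 1, -1, -1],   # over the same eight units
    [1, -1, 1, -1, 1, -1, 1, -1],
], dtype=float)
n = patterns.shape[1]

# Learning phase: after each pattern is presented, strengthen the
# connection between units with matching activations and weaken it
# between mismatched ones (the Hebb rule, as an outer product).
W = np.zeros((n, n))
for p in patterns:
    W += np.outer(p, p) / n
np.fill_diagonal(W, 0.0)            # no unit connects to itself

# Reconstruction phase: cue the network with only half of the first
# pattern (the remaining units are silent, coded 0) and let activation
# flow along the learned connections.
cue = patterns[0].copy()
cue[4:] = 0.0
reconstructed = np.sign(W @ cue)

print(np.array_equal(reconstructed, patterns[0]))  # True: full pattern restored
```

The network reconstructs the complete first pattern from the partial cue, rather than retrieving a stored copy: the pattern exists nowhere in the network except as a disposition of the weights to recreate it.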

There are three key properties of the type of connectionist memory just described (Clark, 1993).

1. Incremental learning, not discrete representation construction

In symbolic and localist connectionist models, meaningful structures have to be assembled at a discrete point in time from their component units. Initial learning (the creation of a representation) and change (modification of an existing representation) are thus conceptually quite different. In distributed connectionist models, in contrast, representations are incrementally built up through experience, as the network processes multiple input patterns. There is no discrete point at which a representation is constructed, and no distinction between learning and change (representation change is just more learning).

2. Distinct format of currently active representation vs. latent knowledge

In symbolic and localist connectionist models, active and inactive representations have the same format and structure (i.e., a set of nodes connected by links). Retrieval amounts to searching for and finding the structure and activating it; the metaphor is a warehouse or file cabinet in which many representations sit passively until they are pulled out. Moreover, individual representations are discrete and distinct (like separate sheets of paper in the file cabinet), so one can be activated or changed without affecting any others. In distributed connectionist models, in contrast, the currently active representation is a pattern of unit activations. These activations change rapidly moment to moment. Latent or inactive knowledge is not represented in the same format, but is implicit in a set of network connections that allow learned patterns to be reconstructed from input cues. In this latent form, all representations are superposed in a single set of connection weights. Changing one representation will (in general) change all, although the amount of interference may be small depending on specific details such as the similarity of the patterns, the learning rule used, and the size of the network.

It may be difficult to envisage the access of information from memory in any way other than with the familiar metaphor of storage and retrieval. But the notion of reconstruction is very intuitive in other psychological contexts, such as the experience of affect. It makes no sense to ask questions like: where is my happiness (anger, surprise, etc.) stored when I am not feeling it? Obviously these experiences are not "stored" anywhere; instead, our minds have the ability to reproduce or reconstruct these states under appropriate circumstances (i.e., given appropriate cues). Consider knowledge representations in the same light. The representation of another person, the self, or a social group is not "stored" anywhere when it is not currently active; instead, our minds can reconstruct these representations when given appropriate cues.

3. Context sensitivity and flexibility of representations

In symbolic and localist connectionist models, a concept is represented by a structure (see figure 1) that essentially remains the same whether it is inactive (stored away in memory) or active (current focus of attention). If a version of a concept that is tuned to a specific context is needed, it must be constructed on-line from two or more parent concepts (Kunda et al., 1990). Research suggests, however, that conceptual knowledge is pervasively context sensitive (Yeh & Barsalou, 2000). For example, in a sentence like "the bird walked across the barnyard," the concept of "bird" centers on chickens and similar exemplars, while in a suburban backyard context, "bird" would more likely mean something like sparrow (Barsalou, 1987). In distributed connectionist models, in contrast, because representations are reconstructed rather than retrieved, this type of flexibility and context sensitivity is automatically present. Cues that are present at the moment of reconstruction can tune and bias the pattern that is ultimately constructed toward a contextually appropriate version of the concept (Clark, 1993).

As a demonstration of this point, Rumelhart, Smolensky, et al. (1986) trained a network with the typical features of various types of rooms (living room, bedroom, and so forth). Presented with cues that clearly related to only one of the known room types (such as a bed) the network reactivated the entire known bedroom pattern. More important, when cues that typically relate to different rooms were presented (e.g., bed and sofa), the network did not decide arbitrarily between bedroom and living room, nor did it break down and give an error message about incompatible inputs. Instead, it combined compatible elements of the two knowledge structures to produce a concept of a large, fancy bedroom (complete with floor lamp and fireplace). Memories virtually always involve the combination of multiple knowledge structures in the way suggested by this example. For instance, retrieval of an autobiographical memory may be influenced by general knowledge as well as by traces laid down on a specific occasion (Ross, 1989; Loftus, 1979). Or perceptions and reactions to a person who is a member of multiple categories, such as a Pakistani engineer, may be influenced by knowledge relating to all of the categories.

This third property is the key focus in this paper. The property states that connectionist representations involve the on-line reconstruction of concepts (represented as occurrent activation patterns), based on an underlying array of long-term representational resources (connection weights). Applied to the self as a concept, this property is very much in line with the statement by Turner et al. (1994) that "the concept of the self as a separate mental structure does not seem necessary, because we can assume that any and all cognitive resources--long-term knowledge, implicit theories,... and so forth--are recruited, used, and deployed when necessary" to construct a self-representation. With that, we turn to a consideration of the flexible and context-sensitive social self, from an underlying connectionist viewpoint.

Reconstruction of the Social Self

From a connectionist viewpoint as just described, all conceptual knowledge is actively reconstructed, rather than passively retrieved. In particular, this applies to self-knowledge or the representation of the self (Markus & Wurf, 1987). The central insight of the social identity and self-categorization theoretical tradition is that knowledge about one's social groups is used in the construction of the self. This is the basis for self-stereotyping, people's tendency to conform to group norms when group membership is salient (Spears et al., 1997; Turner et al., 1987). The reverse is certainly true as well (although it has received less conceptual emphasis): knowledge about the self is used in the construction of social group representations, in a kind of social projection. This overall pattern has been described as involving a kind of "overlap" of self and group representations, but this term can be misleading. "Overlap" -- as portrayed in Aron et al.'s (1992) IOS scale, for example -- suggests that two independently existing objects share some parts. Instead, it seems better to stick with the ideas of superposition and reconstruction. Self and group representations are both constructed as needed, from a common pool of underlying knowledge that affects them both -- just as the Rumelhart, Smolensky, et al. room schema network can use its basic knowledge about what features go with what to reconstruct different types of bedrooms, living rooms, etc.

What evidence do we have for the notion that self and group representations draw on common long-term knowledge resources (as stated by Turner et al., 1994)? I have developed such evidence in several studies using a response time paradigm. These studies show that people reporting on their own traits or other attributes are able to do so faster for self-attributes that match those of the person's in-group--even though the in-group is not explicitly part of the person's task. As an overview, the logic of the response time method is that