1 Statement and Significance of the Problem

1.1 Statement of the Problem

This project will develop and evaluate a computer-based system called an Adaptive Knowledge Network (AKN), which is a tool to support and assist groups of researchers who form Human Knowledge Networks (HKN). A unique aspect of the current proposal is a balanced and iterative approach to both networks (AKN and HKN). It considers (1) the design, development, and functioning of the AKN, (2) the social conditions, influences on individuals and other characteristics of the HKN, and (3) outcomes of the interaction between the AKN and HKN.

An AKN is a system that supports people searching networked information resources such as the Internet and other hyper-linked archives. It does this by determining, as best it can, the purpose associated with each search. Initially, this is based on purpose statements given by the searchers, and an optional searcher profile. As the search proceeds, the AKN processes pages examined by the searcher, and solicits judgments from the searcher about the level and nature of the value of the pages with respect to the stated purpose. The AKN then solicits purposes from subsequent searchers and those purposes are matched as closely as possible with the purposes of earlier searches. The judgments by those earlier searchers are used to offer guideposts, suggestions, and landmarks (including specific paths between beginning and ending pages) to the present searcher. A central technical problem in the design of the AKN is the effective representation of the purposes in a condensed form and the effective matching of the present purpose against the purposes describing earlier stored searches. The AKN consists of three primary components: the algorithms for identifying and matching text that represents purposes and paths; the content of data representing the purposes and judgments; and the interface through which HKN members interact with the AKN.
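The matching step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a purpose statement is condensed into a bag-of-words vector and compared by cosine similarity; this representation, the function names, and the sample data are hypothetical stand-ins, not the matching scheme the AKN will actually use.

```python
import math
import re
from collections import Counter


def term_vector(purpose):
    """Condense a purpose statement into word counts (an assumed representation)."""
    return Counter(re.findall(r"[a-z]+", purpose.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def suggest_pages(present_purpose, stored_searches, top_k=1):
    """Rank earlier searches by purpose similarity; return their positively judged pages."""
    query = term_vector(present_purpose)
    ranked = sorted(
        stored_searches,
        key=lambda s: cosine(query, term_vector(s["purpose"])),
        reverse=True,
    )
    suggestions = []
    for search in ranked[:top_k]:
        suggestions.extend(page for page, value in search["judgments"].items() if value > 0)
    return suggestions


# Hypothetical stored searches: a purpose statement plus page judgments (1 = useful, 0 = not).
stored = [
    {"purpose": "find surveys of collaborative filtering methods",
     "judgments": {"pageA": 1, "pageB": 0}},
    {"purpose": "locate protein folding datasets",
     "judgments": {"pageC": 1}},
]

print(suggest_pages("recent collaborative filtering surveys", stored))
```

In this sketch, only pages judged useful by the best-matching earlier searcher are offered to the present searcher; a fuller treatment would also carry the paths between pages.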

An HKN, defined in detail below, is a possibly changing group of people who work in some common problem areas, share some common information resources, and work toward some common goals. The HKN members may or may not interact directly with each other, but are related through their shared problems, resources, and goals, and resulting knowledge. Thus it is the shared knowledge distributed through the network of relationships that constitutes an HKN as more than a collection of individuals. The ultimate measure of the effectiveness of such an AKN is in its use and evaluation by, and consequences for, the individual users and their collective human knowledge network (HKN).

This study applies both formative and summative evaluation in an iterative research design to use insights gained from the study of each network to better understand both the AKN and the HKN and their interaction. Considering both networks within a general perspective will provide new insight into how knowledge and distributed intelligence are developed and structured. The entire project comprises learning about and by the AKN, about and by the HKN, and about and by the general process of developing and learning from shared concepts, or “knowledge and distributed intelligence”. Figure One portrays the three-loop structure of the study: inner design loops, intermediate formative evaluation loops, and a final summative evaluation loop, of the AKN, the HKN’s use of the AKN, and salient implications to the HKN.

[Figure One: the three-loop structure of the study. Revised figure to be inserted.]

1.2 Significance of the Problem and Proposed Research

(1) Methodological Significance. We will integrate multiple methods across the study phases, including traditional methods such as focus groups, interviews and server log analysis, with some more focused and unique methods, including: ScreenCam© interviewing, to probe users' mental processes; Grand Tours, to better understand their environments; and semantic nets, to identify meaningful clusters of individual search purposes and HKN conceptual commonality. These methods are applied as part of an overall program of iterated formative and summative evaluation. At a higher level, the full results of our analysis of the role of the AKN in supporting the HKN will form the basis for subsequent discussions with key individuals in the HKN, who will provide validation and explication of concepts and principles revealed by our analysis, as well as of our models and insights themselves (labeled “feed forward” in Figure One). Possible evaluation implications are new methods for design and evaluation of interactive systems, based on non-traditional performance indicators, and metrics that may be broadly applicable to human-centered systems. This is an area of interest where relatively little consensus has been achieved.

(2) Impact on the Human Knowledge Networks Studied. Among other goals, the project will help the HKN become more self-aware about its information practices and their relation to its functional and social goals, and will provide a tool that should improve HKN performance.

(3) General Impacts. These impacts correspond to several of the goals of the KDI initiative: (a) enhanced understanding of the processes and fundamental results of learning; (b) a more complete understanding of the fundamental processes of distributed intelligence in natural and artificial systems; (c) improved technical and social performance of knowledge generation and use; and (d) increased effectiveness of organizations that work together over distance or time.

2 Goals

2.1 Objectives of the Proposed Research

(1) To develop and to assess schemes for using information, provided by intelligent users of a network, in order to develop a self-organizing network that adapts to the distinctive information needs and information practices of its communities of users.

(2) To develop methods for rigorously assessing the effectiveness of those schemes, in real world situations, with naturally occurring networks of users.

(3) To make available the code for the most effective of the schemes, together with the evidence produced by the validation process.

(4) To improve our understanding of how Human Knowledge Networks and Adaptive Knowledge Network technologies interact, and how that interaction affects Knowledge and Distributed Intelligence.

2.2 Deliverables

The deliverables of this project will be: (1) technical reports and scholarly papers in the literatures of Information Management, Information Retrieval, Human-Computer Interaction, Computer Science, Operations Research, and Organizational Communication; (2) validated code, in Java and/or its successor languages, and an appropriate DBMS, such as Sybase, implementing the most effective of the schemes developed; (3) methods and protocols for the systematic evaluation of the impact of an information tool such as this on particular communities such as the ones to be studied here; (4) a collection of screen histories, survey results, transaction logs and transcribed interviews for use by future researchers; and (5) mini-conferences or workshops whose focus, each year, is appropriate to that stage of the project.

2.3 Broader Impacts of the Work

The expected benefits of the research are (1) validated schemes (in terms of internal algorithms and external interfaces, both technical and social) for making a Knowledge Network self-organizing or Adaptive; (2) improved understanding of the principles underlying the effectiveness, impact and limitations of Adaptive Knowledge Networks of the type proposed here; (3) Adaptive Knowledge Networks that have been tuned to and validated for several specific scientific communities; and (4) a deeper understanding of the role of shared information and concepts within HKNs. It is also hoped that the project will result in a method for information finding that extends existing methods for collaborative filtering and recommending or annotating so effectively that adoption will start to grow exponentially.

3 Previous Related Work

3.1 Work by Others

3.1.1 Recommender Systems

There is an enormous body of work on what are variously called “recommender systems”, or automated collaborative filtering systems (Oard, 1997; Resnick & Varian, 1997; Varian, 1997). In general, discussions of these systems, many of which have commercial aspirations and current applications, are not very explicit about how matching is accomplished. It appears, however, that all of them conform more or less to the patent obtained by Hey and Concord (1987). In practice, they rely on variants of linear classifiers to represent users by a vector of real numbers labeled by items (which may be movies, books, email messages, or web pages) and to compute user similarity through various metrics and/or inner product measures on the space of user vectors. In this way, a recommender system such as GroupLens can transform an individual’s preferences for movies, represented as a matrix Pref(person,movie), into a representation of the “person” which serves to identify similar persons. It also identifies movies (as a vector on the base defined by persons) so that the system can find movies similar to a given one. The combination provides a map from movies that I like, through the full preference matrix, to other movies that I might like. The model has some features in common with our approach, but our approach differs in several key ways. Existing models (MovieLens, Amazon.com, numerous others) classify “people”, being able to take for granted that the domain of interest has been specified by the decision to visit the site (i.e., Movies, Books, etc.). Our system links the “profile” or description of the purpose, not only to the individual, but also to the expressed intention (stated in natural language), and to the assessments provided during an encounter with the AKN.
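The map from “movies that I like, through the full preference matrix, to other movies that I might like” can be sketched as follows. This is an illustrative sketch only: the small preference matrix, the names, and the choice of cosine similarity as the inner-product measure are assumptions, and it does not reproduce the method of any particular system named above.

```python
import math

# Hypothetical preference matrix Pref(person, movie); 0 means "not rated".
movies = ["M1", "M2", "M3", "M4"]
pref = {
    "ann":  [5, 4, 0, 0],
    "bob":  [5, 5, 4, 0],
    "cara": [0, 1, 0, 5],
}


def cosine(u, v):
    """Normalized inner product of two user vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def recommend(person):
    """Score movies the person has not rated by similarity-weighted ratings of others."""
    mine = pref[person]
    scores = {}
    for other, theirs in pref.items():
        if other == person:
            continue
        weight = cosine(mine, theirs)  # how similar the two people are
        for movie, my_rating, their_rating in zip(movies, mine, theirs):
            if my_rating == 0 and their_rating > 0:
                scores[movie] = scores.get(movie, 0.0) + weight * their_rating
    # Highest-scoring movies first.
    return sorted(scores, key=scores.get, reverse=True)


print(recommend("ann"))
```

Note that the person is represented only by ratings; nothing in this scheme records why the person was looking, which is the gap the purpose statements and encounter assessments of the AKN are meant to fill.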

Information in the proposed AKN about the encounter, called an Information Seeking Episode Representation (ISER), has similarities to some other existing systems, particularly Alexa (Alexa, 1998). Alexa draws conclusions from the choices that a user makes, and from the (perceived) dwell time on a page. “We can even learn from the fact that users visit several related sites as they browse the Web, and tend to spend more time at the sites that give them more of what they are looking for.” Of course, even with a downloadable application, the system cannot know how much of the total elapsed time has been spent attending to the screen presented. Alexa permits suggestions (“We also learn from users by letting them explicitly make suggestions of "Where to Go Next" from a given Web page.”). This is an effective Web-based realization of the concept in our earlier ANLI (Adaptive Library Network Interface) system, which was developed 7 years ago, with support from the US Department of Education (Kantor, 1993; Kantor & Rice, 1995; Zhao & Kantor, 1993). ANLI was based on the underlying notion that “if you find this book interesting, you will also find X to be interesting”. However, that kind of linking is effective only insofar as the similarity perceived by the recommender is valid for the recommendee.

This work is also related to “digital library” research, as it represents one way of bringing order to networked archives, essentially helping them to “self-organize” through repeated interactions with their users. The particular combination of representational content (pheromones and episodes) seems unique to this project. We are in touch with other related projects, including the California Digital Library (Smith T, p.c.); research at the CIMIC center of Rutgers (Adam, p.c.); the Rutgers Human-in-the-Loop Digital Library Project (Flanagan, p.c.); work on summarization at Columbia University (Klavans, p.c.; Radev, p.c.) and others.

The AntWorld system, which introduces the notion of a digital information pheromone, to compress information about a searcher's purpose, is currently being developed (Kantor, Melamed, & Boros, 1997) with funding from DARPA. That ongoing project represents a substantial savings or in-kind match to the proposed project, as the fundamental architecture of the AKN is being developed with other funds.

3.1.2 Human Networks

Individuals, in normal social contexts, as well as in information-sharing human knowledge networks, do not function in isolation. Rather, they are products of, draw resources from, and contribute to, interactions with others. These interactions involve indirect, redundant, and clustered patterns, rather than simply direct source-receiver relations. Similarly, human knowledge networks involve patterns of direct and indirect relations among information resources (members, Web pages, etc.). Thus, we argue that paths or itineraries are more important than simple direct links between first and last pages; contextualized and evaluated search paths will provide more relevant information for HKN users, leading to better understanding of shared HKN concepts.

Social network theory argues that people develop their shared attitudes, norms or behaviors through exposure to proximate others in a social network (Wellman, 1983). Through networks, individuals exchange information, vicariously experience others’ behaviors, legitimate changes and reduce uncertainty about an event, idea or phenomenon (Burt, 1973; Davis, 1966; Rogers & Kincaid, 1981; Tichy & Fombrun, 1979). Diffusion studies have found, for example, that adoption of an innovation by supervisors and co-workers has a stronger influence on one’s own adoption, to the degree that the work group is interlinked (“cohesive”) (O’Keefe, Karnaghan & Rubenstein, 1975). Thus network theory (Rice & Mitchell, 1973) would imply that simply linking two Web pages or items (a formal relation) ignores the structural context determining how they are related (an interactional relation through the shared, distinct, and indirect paths involving other links and references), and thus much of the shared meaning of the references. Shared meanings, across multiple and diverse contexts, depend on local perceptions and interpretations (Wagner, 1994). Shared problem definition and information transfer generate new and increasingly complex knowledge, in response to known problems and discovery or definition of new or neglected ones (Cicourel’s 1990 socially distributed cognition). Hutchins (1991) conceptualized “cognition as embodied within the context in which it occurs...distributed across individuals and the setting” (Rogers, 1993, p. 297). Members and surrounding institutions can draw on this resource, which can then become a "public good". For example, Tsai and Ghoshal (1998) found that structural and relational dimensions of such shared resources influenced resource exchange across 15 organizational units studied, which in turn increased product innovation.

Computer networking may facilitate this distributed cognition by reducing the dependence on being in the same place at the same time. However, with this "disembedding" (Giddens, 1990), development of trust becomes more abstract, depending on information from unknown others, and unknown contexts. Thus the more complete reference linkage information proposed in the AKN should increase perceived trust and sense-making, both by illuminating the diverse contexts underlying associated references, and by making indirect knowledge claims more explicit. Further, Rogers’s (1993) study of coordinating computer-mediated work suggests that simplistic embedding of informational relations in the network may actually reduce important social relations needed to keep others informed of the distribution of information among individuals. Such group self-awareness with regard to information helps avoid conflicts based on acontextual interpretations. In the AKN, making indirect and contextual linkages explicit should improve overall social cognition.

In a related discussion, Wenger (1998) critiqued the notion that learning is an individual process (time delimited, separated from our other activities, and requiring a teacher). He situates learning within the context of other practices, as a fundamentally social process, reflecting people’s participation in multiple “communities of practice”. Much learning between and within communities occurs when boundaries are rich in interactions, whether formal, informal, or through a computer based system such as an AKN that provides associated and linked references. We propose that the AKN, by providing associated references, and paths across those references, will reduce the barriers to entry for newcomers to, and the barriers for shared conceptual understanding by members of, communities of practice. In particular, this will help each HKN to be more aware of the broader context in which its practice is located.

3.2 Relation to Prior Work by the PIs

This project will build upon the DARPA-funded study of Digital Information Pheromones (Kantor, Melamed & Boros, 1997), developing a complementary point of view that shifts attention from individual users of the network to users sharing information with each other to improve their own effectiveness. The basic matching engine is built on principles of Logical Analysis of Data, developed by Boros and Hammer, with support from the ONR (Boros & Hammer, 1997).

The formative and summative evaluation is related to a substantial body of work by Rice (Rice & Gattiker, 1999; Williams, Rice & Rogers, 1988) and Covi (1997a, 1997b). Majchrzak, Rice and colleagues are working on mediated support for virtual inter-organizational design teams (Majchrzak, Rice, King, Malhotra & Ba, 1999). Evaluation techniques also build on extensive work in library valuation (Kantor various), including the use of data envelopment analysis (Shim, Kantor). The work on interfaces builds on prior work by Pérez-Carballo (Belkin et al., 1997, 1998).

The PI is a member of the Digital Library Metrics Working Group (Metrics, 1999), which provides a rich source of public and private information about the state of the art in evaluation of information systems. The AKN concept, in a very naïve form, was proposed by Kantor in 1987. In 1991 it formed the core of a project (Kantor & Rice, 1995), supported by the US Department of Education, to develop an Adaptive Network associated with an online library catalog. With the arrival of the World Wide Web it became possible to formulate the concept of digital "information pheromones": condensed representations of users’ purposes that can be left like traces in the World Wide Web. The “AntWorld” project, now underway, supports finding information in networks by collecting explicit user judgments of the links followed. The proposed AKN system goes beyond that model, with particular attention to the path or order of events as the Information Seeking Episode (ISE) progresses.

4 Components of an Adaptive Knowledge Network

4.1 Itineraries and Information Seeking Episodes

The notion of an itinerary or path is key to developing the AKN. An itinerary is the ordered set of links and pages traversed during a single episode (which may extend over several related sessions). During a search, there are two ways that an itinerary can be elaborated. First, the searcher may file, from time to time, a (refined) statement of purpose. Second, extending AntWorld, the searcher may provide, from time to time, contextualized judgments of the benefits of following a link. An elaborated itinerary thus represents a searcher who gave an initial statement of purpose, started at some node, followed links, and offered judgments or revisions of purpose along the way.
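One possible way to hold an elaborated itinerary in memory is sketched below. The class and field names (Step, Itinerary, judgment, revised_purpose) and the integer judgment scale are hypothetical choices made for illustration; the proposal does not fix a data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Step:
    """One page reached by following a link, with optional elaborations."""
    page: str
    judgment: Optional[int] = None          # assumed scale: higher means more useful
    revised_purpose: Optional[str] = None   # a refined statement of purpose, if filed here


@dataclass
class Itinerary:
    """Ordered record of a single information seeking episode."""
    initial_purpose: str
    steps: List[Step] = field(default_factory=list)

    def visit(self, page, judgment=None, revised_purpose=None):
        """Append the next page in the order it was traversed."""
        self.steps.append(Step(page, judgment, revised_purpose))

    def current_purpose(self):
        """Most recent refined purpose, falling back to the initial statement."""
        for step in reversed(self.steps):
            if step.revised_purpose is not None:
                return step.revised_purpose
        return self.initial_purpose

    def path(self):
        """Pages in traversal order, from starting node to latest page."""
        return [step.page for step in self.steps]


# Hypothetical episode: start page, one judged page, one purpose revision.
episode = Itinerary("find work on adaptive library catalogs")
episode.visit("start.html")
episode.visit("catalogs.html", judgment=2)
episode.visit("anli.html", judgment=3, revised_purpose="find evaluations of adaptive catalogs")

print(episode.path())
print(episode.current_purpose())
```

Keeping the steps in a list preserves the order of events, which is what distinguishes an itinerary from a simple link between the first and last pages.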