Shifting the Focus from Control to Communication

Janvier 1997 - Journées Francophones des Langages Applicatifs - JFLA97

Shifting the focus from control to communication: The STReams OBjects Environments model of communicating agents

Stefano A. Cerri

Dipartimento di Scienze dell'Informazione

Università di Milano

20135 MILANO, Italy

Abstract. The paper presents the computational model underlying a new agent communication language. Even if the applicative research context suggesting such an aim is the one traditionally identified by the Artificial Intelligence in Education community, this context, as it is motivated in the paper, is generic enough to represent a wider class of applications, i.e. all those that privilege communication among autonomous human and artificial agents. For instance, those addressed at FIPA (

The model, called STROBE, has been identified and two prototypical languages inspired by the model have been implemented. In order to describe unambiguously the STROBE model we have chosen to use a formal programming language, i.e. Scheme. STROBE shows how generic communication may be described and implemented by means of STReams of pragmatically marked messages to be exchanged by agents represented as OBjects interpreting messages in multiple Environments. The model, therefore, is at the same time a well-defined software architecture and a proposal for a lexicon potentially useful for exchanging efforts in emergent agent technologies.

Recent advancements in the project are producing new modules to be integrated in the STROBE architecture, in particular a re-implementation of KQML (Knowledge Query and Manipulation Language), the language for software agents, conceived with the purpose of reducing the primitives to a minimal set and introducing compositionality, so that new primitives can be designed from the basic ones.

An outline of the expected functionalities of the languages under development may allow the reader to appreciate whether and how they fit the requirements, i.e. cognitive simplicity for designing and controlling generic multi-agent dialogues, including human and artificial communication facilities.

1. Introduction

Communication among intelligent agents is one of the popular research issues at the moment. Many of the current efforts are reported in [1], together with an extended bibliography. In spite of the impressive, convincing research advancements, we do not yet notice a major impact at the application level. As was the case for Artificial Intelligence applications, possibly once more the real problem is complexity: not so much for the equipped machine, which must perform according to well designed software, but rather for the designers and implementers, who must conceive and realize a suitable application by using available methods and tools.

Within the Artificial Intelligence and Education community [2] we notice a major concern that reaches the same conclusion, i.e. complexity in AI & ED research makes it almost unviable for real AI & ED applications. Even worse: research does not accumulate, because neither the description languages nor the implementation languages are shared. Self's proposal was to go back to logic for expressing issues such as mutual beliefs or planning the moves in a dialogue. This hypothesis fits trends in the AI community for the same foundational purpose, but at the same time it is challenged by arguments such as those reported in this paper.

At the moment, AI & ED programs increasingly choose object / agent (or actor) - based architectures, considering off the shelf languages as implementation tools (e.g. [3] reviews most of them). What happens is that those languages offer primitives and virtual machine models that do not really match those required by the applications or, if they do, are too complex to learn and use. Cognitive simplicity in the conception and design of new applications becomes a must for any concrete dissemination of research results as well as for most applicative efforts.

We have chosen cognitive simplicity and composition of primitives as our main purpose. As we were skeptical that abstract logical formalisms would provide cognitively simple models of dynamic processes, we have instead adopted a representation of communicative processes based on a formal programming language suitable for process reification and visualization. Among all languages, Scheme [4] was chosen for its abstraction power associated with its formal foundations[1] and for the simplicity of its underlying evaluation mechanism (e.g. the environment model of evaluation). The criticism that Scheme is sequential and therefore unable to model multi-agent interactions is challenged by active research in concurrent languages.

In the following, we will describe:

a. Why educational applications require generic communicative processes, and therefore why advances in educational software are enabled by advances in models and languages supporting communication and, conversely, the requirements of educational dialogues support efforts in the design of new communication languages. The educational metaphor, therefore, pushes technologies of a much wider applicability, such as those claimed to be (almost) mature by industrial initiatives such as FIPA (

b. How a simple agent-to-agent communication model[2] may be described by three powerful Scheme primitives, i.e. STReams, OBjects and Environments.

c. What can be borrowed from a modern artificial agent communication language, i.e. KQML [7], and integrated in the STROBE model, but, at the same time, what its limitations are in modeling dialogues where humans participate, and thus why multiple viewpoints (or cognitive environments) are required.

d. Why multiple communicating agents of generic types (humans and artificial agents) require functionalities typically associated with enhanced operating systems or actors, and how we plan to model them by means of Scheme extensions. Finally, how the high level (Scheme) descriptions and prototypes may be integrated with lower level ones, by means of interpreting / integrating Scheme with Java, thus offering machine independent resource management utilities to be used on the net.

2. Learning as a side effect of communication

One fundamental reflection for anybody interested in Education is that the goal of Education is that learners learn, i.e. change state during / after a communicative process. The process does not per se need to be "educational". That term applies only afterwards, once the new state reached by the learner as a result of communicating has been evaluated. Communication is the real issue for learning and therefore for Education; learning may occur as a side effect (as was agreed in the workshop reported in [8]). Educational software, then, is nothing other than highly interactive software. Whether or not communication stimulates learning in the learner is not primarily a property of the software managing the communicative process but a relation between the process and its effects on the learner[3].

For instance, in [9] there is an example of learning outcomes from dialogues with a simulator. The author's assertion that "there is an urgent need to further research in this area and it is one of our aims to try to model these different styles computationally" supports our assumption that formal (computer) languages for dialogues are missing. Looking back in the literature (see, e.g. [10]), we notice that the foundations for languages representing human dialogues were laid down years ago, but still the need is not satisfied.

Other authors (e.g. [11], who developed the reflective actor language ReActalk on top of Smalltalk) claim with good reasons that "models developed for agent modeling are of relevance for practical applications, especially for open distributed applications". Among these applications, Intelligent Tutoring Systems play a major role (cf. [12]). We have shown in [3] and [37], where we used the actor languages ABCL/1 and Rosette, that when the chosen actor granularity fits the components of the problem to be solved, the conception and implementation of actor-based software may be relatively simple, and so may its abstraction and generalization. However, the global, concurrent message exchange control process is not easily conceived. The transition from a sequential, synchronous mental model of computation (control and communication) to a concurrent, asynchronous one is a hard process for any human player engaged in the technological arena today. In order to contribute, we have decided to start from understanding and modeling human-system dialogues, and thus the processes in the machine that are ultimately suitable for controlling a dialogue with a human.

Those "dialogue control processes (DCPs)" are the ones definitely interesting for understanding and enhancing primarily human-to-system communication, but, as we will see, also generic agent-to-agent communication, up to many-to-many participants. Therefore we need to make DCPs as transparent as possible by choosing an adequate underlying virtual machine model and a visible "granularity" of agents and messages that allows us to reason also in terms of human dialogues. Tradeoffs between controlling joint variables (versus actor's replacements and "pure" functional languages) and the higher level perception of the human agent's exchanges in the dialogues are exactly the issue that we try to address with our research described here in its foundational results.

2.1. Types of communication

There exist many types of communication among humans. The discipline that studies them - pragmatics - has made remarkable advancements (cf. [13] for an extensive presentation). In human-to-system communication, similarly, software layers in the system manage various communicative processes with the user.

Among those types, at the risk of oversimplifying, we will select three that we assume fit best with past and current human-computer communication systems: information systems, design systems and tutorial systems. Each type is characterized by two properties: the initiative taken (human or computer) and the type of speech acts [14] involved.

Assume that U is the user and C is the computer, playing in turn the role of an Information, a Design or a Tutoring system committed to managing dialogues with the user.

Information systems (when they are mature) consist mainly of communication exchanges where U asks C questions and C answers U. During the construction of an Information system, U tells C new information that C stores in its archive. Design exchanges (e.g. programming environments) consist mainly of orders from U to C and the execution of those orders by C. Finally, (strictly) tutoring systems consist of exchanges where C asks U questions, U answers C and C decides what to do on the basis of U's answer. In that case, C is not interested in knowing what U believes just in order to update C's knowledge - as U is in the reciprocal case of asking C questions in informative exchanges - but in deciding what initiative to take during the dialogue, in order to accomplish what is essentially an evaluation task leading to the next phase of the conversation. From this simplification we may assume that what we called "strictly tutorial User-Computer exchanges" are basically those where the Computer tests the knowledge of the User. In order to avoid confusion, we may call the systems supporting those exchanges Testing systems.
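The classification above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names Move and classify_exchange and the speech act labels ("ask", "tell", "order") are hypothetical and do not come from the STROBE prototypes. The sketch decides the exchange type from the two properties named in the text, i.e. who takes the initiative and which speech act opens the exchange.

```python
from dataclasses import dataclass

# Hypothetical labels: the text names the exchange types, not a data format.
USER, COMPUTER = "U", "C"


@dataclass(frozen=True)
class Move:
    sender: str      # "U" or "C": who takes the initiative
    speech_act: str  # assumed labels: "ask", "tell", "order"


def classify_exchange(opening: Move) -> str:
    """Classify an exchange from its opening move."""
    if opening.sender == USER and opening.speech_act in ("ask", "tell"):
        return "information"   # U asks C, or U tells C something to store
    if opening.sender == USER and opening.speech_act == "order":
        return "design"        # U orders, C executes
    if opening.sender == COMPUTER and opening.speech_act == "ask":
        return "testing"       # C asks, U answers, C evaluates the answer
    return "unclassified"
```

Any opening move outside the three selected types is reported as "unclassified" rather than forced into one of them, which matches the text's remark that the selection is a simplification.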

From various sources in the literature dedicated to Educational software, we may conclude that Tutoring Systems (and / or Learning Environments) do engage in dialogues with the learner that include Information, Design and Testing phases. Therefore educational applications require managing dialogues of generic types with the human user. Any student may interrupt his / her teacher to ask for information. Any student may wish to engage in an exercise on a simulated environment where he / she may play with situations by ordering the simulator to run under his / her control. Information systems and design systems may be considered part of any really effective educational system of the future. What these systems need is to control dialogues with the user in a fashion that is compatible with the user’s needs, intentions, preconceptions, goals…

Notice that in testing exchanges C takes the initiative, while in Information and Design exchanges U takes the initiative. Human-to-human dialogues are such that either of the two may take the initiative at any time, so that a swap of initiative is a common feature. If we aim at more flexible and powerful artifacts, it is clear that human-to-computer dialogue models should also allow informative, design and testing exchanges to be embedded within each other, at the initiative of either partner. This requirement, if respected by our proposed solutions, will allow us to generalize the models to generic agent-to-agent dialogues, where each agent, human or artificial, is associated with a role (caller, called...) in each exchange, while roles may be swapped during the dialogue[4].

Models of agent-to-agent communication require explicit roles, in addition to an explicit association of agents with the physical entities participating in the communicative process. Each exchange in communication will then be defined, at least, by a type and by a role associated with each partner.
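A minimal sketch of an exchange defined by a type and a role per partner is given below, assuming a hypothetical Exchange record; none of these names come from the paper's prototypes. It shows one exchange embedded within another, with the roles of caller and called swapped for the embedded one.

```python
from dataclasses import dataclass, field


@dataclass
class Exchange:
    kind: str     # "information", "design" or "testing"
    caller: str   # agent holding the initiative in this exchange
    called: str   # agent responding in this exchange
    embedded: list = field(default_factory=list)  # exchanges opened inside this one

    def embed(self, kind: str, swap_roles: bool = False) -> "Exchange":
        """Open a sub-exchange, optionally swapping caller and called."""
        if swap_roles:
            caller, called = self.called, self.caller
        else:
            caller, called = self.caller, self.called
        sub = Exchange(kind, caller, called)
        self.embedded.append(sub)
        return sub


# A testing exchange led by the computer C is interrupted by the user U,
# who takes the initiative to ask for information.
quiz = Exchange("testing", caller="C", called="U")
question = quiz.embed("information", swap_roles=True)
```

Keeping the embedded exchanges inside the enclosing one preserves the point at which the initiative was swapped, so the enclosing exchange can be resumed once the embedded one is closed.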

2.2. Communication is not transmission

As one may easily notice, communication is very different from transmission. Therefore we are not just interested here in phenomena at the (low) transmission level (e.g. active sockets, busy channel, synchronization, queue scheduling) but mainly in those at the high level of active agents (available knowledge, intentions, preconditions, effects, etc.). Certainly (high level) communication between agents (human or artificial) must ultimately be founded on reliable transmission of the messages. But the latter is not the major concern; it is just an important enabling factor that we assume we are able to guarantee. For instance, we assume not only that communicative messages include pragmatic aspects (e.g. sender, destinations, intention, role…), but also that these aspects may be used by the receiver to process the message (e.g. to process the queue of incoming messages).
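As an illustration of the last point, the sketch below marks each message with the pragmatic aspects listed above and lets the receiver order its queue of incoming messages by intention. The Message fields follow the text; the PRIORITY table and the schedule function are hypothetical policy choices made only for this example.

```python
from dataclasses import dataclass


@dataclass
class Message:
    sender: str
    receivers: tuple
    intention: str   # assumed labels: "ask", "tell", "order"
    role: str        # role of the sender in the exchange, e.g. "caller"
    content: str


# Hypothetical policy: a lower number is served first.
PRIORITY = {"order": 0, "ask": 1, "tell": 2}


def schedule(queue):
    """Order pending messages by intention.

    sorted() is stable, so arrival order is kept among messages that
    share the same intention; unknown intentions are served last.
    """
    return sorted(queue, key=lambda m: PRIORITY.get(m.intention, len(PRIORITY)))


inbox = [
    Message("U", ("C",), "tell", "caller", "(price IBM 150)"),
    Message("U", ("C",), "ask", "caller", "(price IBM ?price)"),
    Message("A2", ("C",), "order", "caller", "(run-simulation)"),
]
ordered = [m.intention for m in schedule(inbox)]
```

The transmission layer would deliver these three messages in arrival order; it is the pragmatic marking that lets the receiver decide to treat them differently.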

Assuming that messages are correctly transmitted, communication is successful if the rules associated with the pragmatics of the communicative process have been respected. Agent communication languages, such as KQML [7], do address the issues of communication among intelligent information agents, under the hypothesis that these agents are artificial and that they “serve” information to clients asking for it. Their pragmatic level solves most of the transmission and interoperability problems, but lacks substantial components in at least two situations. One is the case where human agents are part of the multi-agent conversation; the other is the case where the conversation is generic, i.e. includes all three types of exchange cited above (and perhaps other ones, such as those including commitments by a participating agent).

One weak aspect of KQML is related to multiple viewpoints [18], which we address by using cognitive environments as a Scheme first class ADT [19]. Another one concerns the choice of the primitives. In this respect, we are designing primitives that fit specifications deduced from available research on the pragmatic classification of human dialogues (such as the one reported in [9]). A third weakness concerns reflection, as most researchers point out (e.g. [2, 12]).

2.3. Agent communication languages: KQML

The "Knowledge Sharing Effort" community has recently produced significant advancements, in particular concerning the language KQML[5]. This is a language for specifying dialogues among artificial agents by means of primitives that allow queries and directives expressed in various "content languages" (e.g. SQL, Prolog) to be embedded into KQML messages. These primitives are "performatives", such as evaluate, ask-if, reply, tell, advertise, etc., and the types of the speech acts associated with the performatives are: assertion, query, command "or any other mutually agreed upon speech act". Both choices are quite similar to ours. The distinguishing property of KQML with respect to traditional languages is the supposed independence of the "pragmatic level" from the language of the embedded "content message". This allows an important level of interoperability. We also share this view.

A KQML application to Authoring Educational software is described in [20], where the concern is mainly software reuse. We are encouraged by this and similar results concerning the productivity of software, but we are not sure that tools developed for a specific context of applications - interoperability among data and knowledge bases for informative purposes - will make it easy to express issues typical of a quite different context, i.e. generic human-computer dialogues. One of those issues is user modeling. In [21] we may find an attempt to customize KQML primitives for learner modeling. We will see whether and how the results of this attempt will cross / complement our own.

We believe that the limitation of KQML with respect to generic dialogues is the assumption that the mutual beliefs of agents are correct: in the general case, this assumption may not hold. In our model, we try to capture exactly those more general cases of dialogue that occur frequently in educational applications and, more generally, in multi-agent interactions.

3. The STROBE model[6]

In [22] we have outlined a model of communication between agents that is based on the three primitive notions of stream, object and environment. The environment component of the model has been discussed in [19]. We have shown there that the desiderata emerging from the analysis of realistic agent-to-agent dialogues induce two requirements concerning the computational formalism adopted:

Requirement #1: the environment[7] for evaluating procedures and variables is a first class abstract data type.

Requirement #2: multiple environments are simultaneously available within the same agent-object.
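The two requirements can be illustrated with a small Python sketch. It is not the Scheme formulation used in [19]: the classes Environment and Agent, and the choice of keeping one environment per dialogue partner, are assumptions made only for this example.

```python
class Environment:
    """A chain of frames; each frame maps names to values."""

    def __init__(self, bindings=None, parent=None):
        self.frame = dict(bindings or {})
        self.parent = parent

    def define(self, name, value):
        self.frame[name] = value

    def lookup(self, name):
        env = self
        while env is not None:          # walk from the local frame outwards
            if name in env.frame:
                return env.frame[name]
            env = env.parent
        raise NameError(name)


class Agent:
    """An agent-object holding several environments at the same time."""

    def __init__(self, name, global_bindings):
        self.name = name
        self.global_env = Environment(global_bindings)
        self.partner_envs = {}          # Requirement #2: multiple environments

    def env_for(self, partner):
        # Requirement #1: an environment is an ordinary value that can be
        # created, stored, returned and passed around.
        if partner not in self.partner_envs:
            self.partner_envs[partner] = Environment(parent=self.global_env)
        return self.partner_envs[partner]

    def evaluate(self, name, partner):
        return self.env_for(partner).lookup(name)


# Hypothetical data: the same symbol is bound differently for two partners.
tutor = Agent("tutor", {"unit": "dollars"})
tutor.env_for("alice").define("price", 100)
tutor.env_for("bob").define("price", 120)
```

With this layout, tutor.evaluate("price", "alice") and tutor.evaluate("price", "bob") look up the same name in different environments, while both still see the shared global binding for "unit".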

Looking at KQML, we have noticed that our first requirement may fit their virtual architecture. Basically, labeling a message with the explicit language in which the message is expressed is equivalent - in our functional terms - to forcing the partner to use an environment for the evaluation of the query where the global frame binds the language symbols to the evaluation and application functions corresponding to a simulator (or an interpreter, or a compiler including run time support) of the chosen language. The KQML expression

    (ask-one
      :content  (price IBM ?price)
      :receiver stock-service
      :language my-prolog-like-query-language  ;; (corresponding to their LPROLOG)
      :ontology NYSE-TICKS)

may be simulated by our architecture as a request to the receiver agent to use the environment that includes the available definitions of my-prolog-like-query-language. Further, KQML expressions also specify an ontology, i.e. they select a specific environment among many possible ones, where terms are defined in a coherent way suitable to represent - independently from the application - a domain of discourse assumed to be valid for the receiver agent and known to the querier. The natural computational manner to describe the evaluation of a KQML message like the one above is therefore to send a message with content (price IBM ?price) to agent stock-service, where the evaluation environment of the agent is the composition of a global frame containing the bindings of my-prolog-like-query-language and a local frame containing the definitions available in NYSE-TICKS. But what if the receiver's ontology - even if it has the same name - were different from the querier's?
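The composition of frames just described, and the question that closes the paragraph, can be sketched in Python with collections.ChainMap. Everything below is an assumption made for illustration: the message is a plain dictionary, the ask-one binding is a toy lookup function rather than a Prolog-like interpreter, and the ontology contents are placeholder values, not real data.

```python
from collections import ChainMap

# Hypothetical global frame: bindings supplied by the query language.
language_frames = {
    "my-prolog-like-query-language": {
        "ask-one": lambda facts, pred, subject: facts.get((pred, subject)),
    },
}

# Hypothetical local frames: two ontologies with the same name but
# different content (placeholder values).
receiver_ontologies = {"NYSE-TICKS": {("price", "IBM"): 150}}
querier_ontologies = {"NYSE-TICKS": {("price", "IBM"): "150 USD"}}


def evaluate(message, ontologies):
    """Evaluate a message in the environment composed of the ontology's
    local frame (searched first) and the language's global frame."""
    env = ChainMap({"facts": ontologies[message["ontology"]]},
                   language_frames[message["language"]])
    performative = env[message["performative"]]
    pred, subject = message["content"]
    return performative(env["facts"], pred, subject)


message = {
    "performative": "ask-one",
    "content": ("price", "IBM"),
    "receiver": "stock-service",
    "language": "my-prolog-like-query-language",
    "ontology": "NYSE-TICKS",
}

receiver_answer = evaluate(message, receiver_ontologies)
querier_expectation = evaluate(message, querier_ontologies)
```

The same message, naming the same ontology, yields different results in the two environments. The name NYSE-TICKS alone therefore does not guarantee that the two agents share the definitions behind it.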