Development of a MObile Agent Based WEbSEarch in Ajanta
by
Arvind Prakash
A Plan B report submitted in partial fulfillment of the requirements for the degree of
MS in Computer Science
University of Minnesota
1999
Abstract
Chairperson of the Supervisory Committee: Dr. Anand Tripathi
Department of Computer Science.
Ajanta is a Java-based system for programming applications using mobile agents over the Internet. This report explains the design and implementation of a middleware system for performing Web search. We extend the existing File Access System in Ajanta by adding this new primitive. The Web Search system lets the user perform full-text keyword searches on the files in a remote user’s web directory. It not only offers many options to narrow the search, but also presents the results in different views, which can give the user considerable insight into the distribution of the keyword. We also implement another primitive to fetch the status of a remote file. A complete graphical user interface (GUI) for the File Access System has also been designed and developed. This GUI is designed to be generic and easily extensible.
Table of Contents
1. Introduction
Overall Goals
Motivations
Why Agent-based?
Why not use the existent search utilities like Yahoo, Altavista etc.?
Salient Features of the WebSearch system
2. Background
2.1. An overview of Ajanta
2.2. Agent, Agent Servers and Itinerary
2.3. File Access System
2.3.1 File server Architecture
2.3.2 File System resource
2.3.3 File Server Thread
3. An Agent-Based Web Search System
3.1 Design Goals and Requirements
3.1.1 Information Filtering
3.1.2 Web search options
3.1.3 Presentation Views at the Client side
3.1.4 Security and Privacy
3.2 Design and Implementation overview
3.2.1 Extensions to Existing File Access system
3.2.2 Information Filtering
3.2.3 Presentation Views
4 GUI for File Access System and Web Search Agent
4.1 Functional Requirements
4.2 GUI Description and Snapshots
4.2.1 File Access System GUI
4.2.2 Web Search GUI
4.3 Implementation Overview
5. Conclusions and Future Work
References
Appendix
Results and Presentation View snapshots
List of figures
Figure 1: The Ajanta Server Architecture
Figure 2: File Access Server Architecture
Figure 3: Agent interaction in WebSearch System
Figure 4: Information filtering on either side
Figure 5: Main GUI for the File Access System
Figure 6: GUI for the Transfer Primitive
Figure 7: WebSearch choice being made in the main GUI
Figure 8: Server Choice drop down box
Figure 9: WebSearch GUI
Figure 10: Segregated View
Figure 11: Combined View
Figure 12: Directory Structure View
Figure 13: Abstract View
Acknowledgments
I thank my advisor, Dr. Anand Tripathi, for giving me the opportunity to work on this interesting project. This would not have been possible without his support, encouragement and suggestions. My gratitude goes to Neeran Karnik, whose invaluable work in creating this system provided me with a base on which to develop my project.
I enjoyed working in the Ajanta group and would like to thank my colleagues Ram Singh and Tanvir Ahmed for their support.
No words will be enough to thank my parents, who inculcated in me a passion to enjoy everything I do and a never-say-die attitude. I am immensely fortunate to be their son.
Two years of graduate life have been an interesting experience. I am thankful to all the friends I made here for their support on both professional and personal fronts.
Finally, my humble thanks to the gracious Almighty for giving me a chance to be part of his all-encompassing, global project.
1. Introduction
Ajanta [1] is a Java-based framework for programming mobile-agent-based applications on the Internet. A mobile agent is a program that represents a user in a network, can migrate from node to node, and can make decisions autonomously on behalf of that user. Its tasks are determined by the agent application and can range from online shopping to real-time applications. After accomplishing their goals, agents may either terminate or return to their “source” to report the results to the user. Thus, applications can launch mobile agents into a network and let them roam it, either on a predetermined path or on one that the agents themselves determine from dynamically gathered inputs.
Traditionally, applications in distributed systems have been structured using the client-server paradigm, in which the client and server communicate either through messages or through remote procedure calls (RPC). This model is synchronous in nature, as the client has to wait while the server processes the request. In the Remote Evaluation (REV) model, the client, instead of invoking a procedure, sends the procedure code to the server and requests the server to execute it and return the results. The mobile agent paradigm differs from RPC and REV mainly in that the agent carries its code as well as its data with it during migration.
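The difference between the three interaction styles can be illustrated with a small, purely local Java sketch. It is not Ajanta code: the ToyServer and ToyAgent classes are hypothetical, and "migration" is simulated by passing the agent object from one in-memory server to the next.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical in-memory "server" that owns some documents.
final class ToyServer {
    final String name;
    final List<String> documents;

    ToyServer(String name, List<String> documents) {
        this.name = name;
        this.documents = documents;
    }

    // RPC style: the procedure lives on the server; the client passes only arguments.
    int countMatches(String keyword) {
        int count = 0;
        for (String doc : documents) {
            if (doc.contains(keyword)) {
                count++;
            }
        }
        return count;
    }

    // REV style: the client ships the procedure; the server runs it on its own data.
    <R> R evaluate(Function<List<String>, R> procedure) {
        return procedure.apply(documents);
    }
}

// Mobile-agent style: code and accumulated state travel together between servers.
final class ToyAgent {
    final String keyword;
    final List<String> findings = new ArrayList<>(); // state carried across hops

    ToyAgent(String keyword) {
        this.keyword = keyword;
    }

    // Executed "at" each server the agent visits (migration is simulated here).
    void visit(ToyServer server) {
        for (String doc : server.documents) {
            if (doc.contains(keyword)) {
                findings.add(server.name + ": " + doc);
            }
        }
    }
}
```

In this toy setting, one ToyAgent can call visit on several servers in turn and bring back a single findings list, whereas the RPC and REV variants require the client to contact each server itself.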
The inherent advantage of this paradigm is its ability to provide increased asynchrony and autonomy in client-server interactions [1] and to move client code and computations to the remote server resources. The agent paradigm also provides other benefits: a client can decompose its task among multiple agents to gain parallelism and fault tolerance. This makes the paradigm particularly well suited to World Wide Web applications such as information searching, filtering and retrieval, and e-commerce.
Overall Goals
Information search and filtering applications often download large amounts of information over a network, process it, and generate comparatively small amounts of result data. If these applications are written using mobile agents, the agents can execute on the server machines and return only the results, thus avoiding network overload. For example, Web-based applications use the stateless HTTP protocol, which often necessitates several network connections for each application-level transaction. If mobile agents are used instead, the client does not have to maintain a network connection while its agents access and process information. The prime goal of this project was thus to develop an agent-based Web Search middleware that extends the existing File Access System of Ajanta, and to develop a generic GUI for the File Access System. A primitive that returns the status of a remote file was also developed. The status comprises file properties such as size, last-modified date and permissions. The Web Search application launches an agent to search for one or more keywords on selected servers. This agent then autonomously searches the web directories of these servers, filters the results according to user directives, and brings back the filtered results. The GUI developed for the File Access System is designed to be fully generic, in the sense that new primitives can be added without changes to the main GUI. The status primitive is valuable for obtaining information about files before fetching them from a remote site.
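As an illustration of what a status primitive might collect, the following sketch gathers the properties named above (size, last-modified date and permissions) using the standard java.io.File API. The FileStatus class and its field names are assumptions made for this example; the report does not show the actual Ajanta type.

```java
import java.io.File;
import java.io.Serializable;
import java.util.Date;

// Hypothetical value object; serializable so that an agent could carry it home.
final class FileStatus implements Serializable {
    final String name;
    final long sizeInBytes;
    final long lastModifiedMillis;
    final boolean readable;
    final boolean writable;

    private FileStatus(String name, long sizeInBytes, long lastModifiedMillis,
                       boolean readable, boolean writable) {
        this.name = name;
        this.sizeInBytes = sizeInBytes;
        this.lastModifiedMillis = lastModifiedMillis;
        this.readable = readable;
        this.writable = writable;
    }

    // Reads the properties of an existing file on the machine where this code runs.
    static FileStatus of(File file) {
        if (!file.exists()) {
            throw new IllegalArgumentException("no such file: " + file.getPath());
        }
        return new FileStatus(file.getName(), file.length(), file.lastModified(),
                              file.canRead(), file.canWrite());
    }

    @Override
    public String toString() {
        return name + " size=" + sizeInBytes + "B modified=" + new Date(lastModifiedMillis)
                + " perms=" + (readable ? "r" : "-") + (writable ? "w" : "-");
    }
}
```

Because only this small object travels back to the client, the user can decide whether a file is worth fetching without transferring its contents.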
Motivations
There were some questions to be answered before we embarked on this project.
Why Agent-based?
There were two reasons for this:
· As mentioned previously, we wanted to exploit the inherent asynchrony provided by the agent-based paradigm. In addition, we wanted to confine the information search and filtering to the server side as much as possible.
· Another reason, of course, was to exercise the Ajanta capabilities by building on the existing File Access System.
Why not use the existent search utilities like Yahoo, Altavista etc.?
· The Web search targets specific users and their web pages. It allows an exhaustive search of a specific user’s web pages, provided that user is running Ajanta’s File Access System server. For example, students at the University of Minnesota can find out whether their instructors’ web pages have been modified in the past few hours. This kind of search is not possible with the existing search utilities.
Salient Features of the WebSearch system
· One agent can visit and search multiple servers for the same keyword(s).
· Fast and efficient search, as the filtering of search results is done at the server side.
· Web Search returns the file names as URLs and fetches abstracts of files if specified.
· The results are presented in different view formats, and thus provide considerable insight into the distribution of the keyword in the remote user’s directory.
· Security and privacy concerns are addressed.
· It is GUI-based and gives the user an easy and handy interface both for entering parameters and for displaying the results.
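To make the server-side filtering and URL-style results concrete, here is a minimal sketch of a keyword search over a web directory. It is a simplification under stated assumptions: it scans file contents directly rather than consulting index files, and the class name WebDirectorySearch, the baseUrl parameter and the case-insensitive matching rule are all hypothetical choices for this example.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class WebDirectorySearch {

    // Returns the URLs of all regular files under webRoot whose text contains the keyword.
    // Runs where the files live, so only the (small) list of URLs needs to travel back.
    static List<String> search(Path webRoot, String baseUrl, String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        List<String> urls = new ArrayList<>();
        try (Stream<Path> walk = Files.walk(webRoot)) {
            List<Path> files = walk.filter(Files::isRegularFile)
                                   .sorted()
                                   .collect(Collectors.toList());
            for (Path file : files) {
                // ISO-8859-1 maps every byte to a character, so decoding never fails.
                String text = new String(Files.readAllBytes(file), StandardCharsets.ISO_8859_1);
                if (text.toLowerCase(Locale.ROOT).contains(needle)) {
                    urls.add(toUrl(webRoot, baseUrl, file));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return urls;
    }

    // Maps a file under webRoot to a URL under baseUrl (assumed to mirror the directory).
    static String toUrl(Path webRoot, String baseUrl, Path file) {
        String relative = webRoot.relativize(file).toString()
                                 .replace(File.separatorChar, '/');
        return baseUrl.endsWith("/") ? baseUrl + relative : baseUrl + "/" + relative;
    }
}
```

An agent visiting several servers could call such a routine once per server and append each result list to the state it carries.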
2. Background
This section gives a brief overview of the Ajanta system. Section 2.2 furnishes some details about the basic components involved in the system. In section 2.3, we describe the existing File Access System in detail.
2.1. An overview of Ajanta
Ajanta provides a programming environment that facilitates the development of applications using the mobile agent methodology. Agents are active mobile objects [2], which encapsulate code and execution context along with data. The Ajanta system is implemented in Java and uses Java facilities such as object serialization, reflection and remote method invocation (RMI). Two of the main requirements of a mobile agent system are security and robustness. Both the host and the agent must be protected from tampering, and the system must guard against malicious users and agents and against “denial of service” attacks. Robustness is also a major concern, especially in a dynamic, unreliable medium like the Internet. Ajanta satisfies these requirements by extending the security model provided in Java.
The first step in creating an agent-based application is to define the services that will be provided by host servers to visiting agents. The server then needs appropriate resources to implement these services. More importantly, a generic framework must be created that allows the server to verify an agent’s identity, create an execution environment, grant the agent restricted access to its local resources, and allow easy agent migration.
2.2. Agent, Agent Servers and Itinerary
In this section, we will give a brief overview of the basic elements of the Ajanta system pertinent to this project.
Figure 1: The Ajanta Server Architecture
Ajanta provides implementations of a generic agent, defined by the Agent class, and a generic agent server, defined by the AgentServer class. Applications can extend these classes to build their own specific servers and agents. Each agent is bound to its host environment object through an object reference named host. The generic agent server also makes it possible to control a visiting agent’s access to the host at any desired level of access-control granularity [4], using the agent’s credentials. An agent’s credentials are a signed certificate containing information such as the names of the agent and its owner. The Ajanta architecture uses a location-independent global naming scheme, the URN (Uniform Resource Name) [3], for referencing or communicating with agents, servers and any other resources.
There is a special class called ItinAgent, which has an itinerary specifying the servers to visit. The user can specify the task list in a request file, and the agent creates the itinerary from this task list file. When the agent is started, the start method of the ItinAgent class finds the first itinEntry in the itinerary and launches the agent to execute that entry. After the agent is done with a task, it moves on to the subsequent tasks in order by looking up the itinerary, which it carries along as it migrates. This is where a novel feature has been introduced in Ajanta: agent itineraries can be shaped by what we call a “pattern of migration”. A pattern [6] separates the specification of an agent’s migration path from its computation tasks. The currently available patterns include sequence, set, selection, loop, split and split-join. In the set pattern, the order of the tasks is not important and the agent picks any one of the pending tasks. The selection pattern selects only one entry based on a user-specified directive, while the loop pattern loops through all tasks in sequence until a user-defined condition is satisfied.
The split pattern results in the creation of child agents for the parallel traversal of its contained patterns. The split-join pattern is a specialization of the split pattern in which the child agents must report their results to some object (usually the parent), and the parent can wait for one or all of the child agents to return. Sequence is the pattern pertinent to this project. It is the simplest of the patterns: the tasks are executed by the same agent, sequentially, in the order in which they were submitted.
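The sequence pattern can be sketched in a few lines of Java. The classes below are hypothetical stand-ins and do not reproduce Ajanta’s ItinAgent or itinerary classes; the URN strings are placeholders, and migration is simulated by a local loop rather than by serializing and transferring the agent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical itinerary entry: which server to visit and what to do there.
final class SketchItinEntry {
    final String serverUrn;
    final String task;

    SketchItinEntry(String serverUrn, String task) {
        this.serverUrn = serverUrn;
        this.task = task;
    }
}

// Sequence pattern: a single agent consumes the entries strictly in submission order.
final class SequenceItinerary {
    private final List<SketchItinEntry> entries = new ArrayList<>();
    private int next = 0; // position travels with the agent, so it survives migration

    SequenceItinerary add(String serverUrn, String task) {
        entries.add(new SketchItinEntry(serverUrn, task));
        return this;
    }

    boolean hasNext() {
        return next < entries.size();
    }

    SketchItinEntry nextEntry() {
        if (!hasNext()) {
            throw new NoSuchElementException("itinerary exhausted");
        }
        return entries.get(next++);
    }
}

// Hypothetical agent that carries its itinerary and a log of what it has done.
final class SketchAgent {
    private final SequenceItinerary itinerary;
    final List<String> log = new ArrayList<>();

    SketchAgent(SequenceItinerary itinerary) {
        this.itinerary = itinerary;
    }

    // A real agent would migrate between entries; here each hop is just a loop step.
    void run() {
        while (itinerary.hasNext()) {
            SketchItinEntry entry = itinerary.nextEntry();
            log.add(entry.serverUrn + " -> " + entry.task);
        }
    }
}
```

Keeping the traversal logic in the itinerary object, separate from the per-server task, mirrors the separation of migration path and computation that the pattern concept describes.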
2.3. File Access System [1]
The File Access System is a classic application (middleware) built on the Ajanta framework. It is designed to allow effective sharing of files over the Internet. Each host (user) runs a File Server, an extension of the agent server, which provides restricted access to a portion of the host’s local files. Visiting agents can request files by name (URN), deposit files into the local file system, search files using keywords, etc.
2.3.1 File server Architecture
Figure 2: File Access Server Architecture
As the File Server extends the basic agent server class, it inherits the basic agent-hosting capabilities. In addition, it implements a FileSystem resource, whose proxy is given to agents for accessing the files. When the file server starts executing, it creates an instance of the FileSystem resource and inserts it into a resource registry. FileSystem is a Java interface, implemented by the FileSystemImpl class. The file system is actually a specific set of files that the user has made available to agents. The user configures the file server by specifying a directory, which acts as the root of the file system that agents are allowed to access. The index files for the search primitive are also stored here.
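The idea of a file-system resource confined to a configured root directory and published through a registry can be sketched as follows. The names SketchFileSystem, RootedFileSystem and SketchResourceRegistry, as well as the read and exists methods, are assumptions made for this illustration; they are not the methods of Ajanta’s FileSystem interface, and proxy creation is omitted.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Hypothetical resource interface offered to visiting agents.
interface SketchFileSystem {
    boolean exists(String relativePath);
    byte[] read(String relativePath);
}

// Implementation that only serves files located under the configured root directory.
final class RootedFileSystem implements SketchFileSystem {
    private final Path root;

    RootedFileSystem(Path root) {
        this.root = root.toAbsolutePath().normalize();
    }

    // Rejects paths such as "../x" or absolute paths that would leave the shared root.
    // Note: symbolic links are not resolved in this sketch.
    private Path resolve(String relativePath) {
        Path resolved = root.resolve(relativePath).normalize();
        if (!resolved.startsWith(root)) {
            throw new SecurityException("outside shared root: " + relativePath);
        }
        return resolved;
    }

    @Override
    public boolean exists(String relativePath) {
        return Files.isRegularFile(resolve(relativePath));
    }

    @Override
    public byte[] read(String relativePath) {
        try {
            return Files.readAllBytes(resolve(relativePath));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

// Hypothetical registry in which the server publishes its resources by name.
final class SketchResourceRegistry {
    private final Map<String, Object> resources = new HashMap<>();

    void register(String name, Object resource) {
        resources.put(name, resource);
    }

    Object lookup(String name) {
        return resources.get(name);
    }
}
```

Checking every request against the normalized root is one simple way to ensure that agents see only the files the user chose to share.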