SLIS Faculty and Course Connectivity

SLIS Semantic Web Project

Najim Alshammari

Omar Alboulola

Saad Alshetairi

Ryan Goetz

Lebo Molefi

Semantic Web – S636

Dr. Ying Ding

December 15, 2008

Group Members

Our group is composed of five students: Najim Alshammari, Omar Alboulola, Saad Alshetairi, Ryan Goetz, and Lebo Molefi.

Omar Alboulola is a second year MIS student with a minor in IT Leadership and Management. He has a bachelor's degree in computer science from King Abdul Aziz University in Saudi Arabia. He has interests in IT, and contributed towards the XML and ontology development of this project.

Najim Alshammari is a second year MIS student from Saudi Arabia who is hoping to soon begin working toward his PhD in Information Science. He contributed towards the XML and ontology development of this project.

Saad Alshetairi is a second year MIS student and faculty member of Information Science at the Institute of Public Administration in Saudi Arabia. He contributed towards the XML and ontology development of this project.

Ryan Goetz is a second year MIS student. He has a Bachelor of Science degree from Purdue University in clinical laboratory science. His interests are web development and systems analysis. Ryan served as the group facilitator. He also encoded the example queries and logical reasoning for this project.

Lebo Molefi is a third year MIS student. She graduated from Indiana University with a Bachelor of Science in Informatics. Her interests are web development, social networks and “Bathroom Graffiti” as a form of social networking. She contributed towards the ontology development and presentation of the project.

Project Description

Our engineering project is a model based on the Indiana University School of Library and Information Science (SLIS) programs. Our motivating scenario was to create semantic data, describing the courses and professors of SLIS, that could be searched by SLIS students or eventually by “agents”. For example, a student may want to know what courses are required for a particular degree or what courses a particular professor teaches.

We set out by identifying faculty ranks across both the Library Science and Information Science programs. Initially, we also took into account that SLIS has programs in both Bloomington and Indianapolis. This later proved cumbersome as we tried to organize our ontology appropriately and decide how much data we actually wanted to work with.

We wanted to show these ranks, whether some faculty members have administrator roles, which classes they teach, and on which campus. In short, we wanted to show any cross-referencing among the major classes: faculty, administrator, students, degree, courses, campus, and so on. In the end we narrowed these down to five main classes: courses, degree, faculty, room, and student.

Project Approach

Our approach was to tackle the project by following the class schedule. Each week, as the class progressed, we moved a step forward on our project. The idea was to apply what we learned in class not only to our homework but to the project as well. Because three project members had no previous exposure to XML, we recognized early that this strategy was needed in order to address the relatively steep learning curves these group members faced. We met approximately every other week to discuss what we had done and what still needed to be done.

XML

We began by creating an XML file to encode the data of our concept, followed by a DTD. Our XML is straightforward and is intended to show what we want the data to look like. We did not take it further because we understood that it would not be used again later in the course. The XML file consists of declarations, elements, and entities. The file was also well formed and valid.

This step involved listing all the faculty members (without their ranks), all the staff members and their titles, the courses offered this semester in SLIS, and the faculty members teaching these courses. See Figures 1, 2, and 3 below. Again, the XML gave us a clear picture of what our data would look like.

Figure 1

Figure 2

Figure 3
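As a rough illustration of the kind of data file described in this section, the following Python sketch parses a small XML fragment that lists one faculty member and one course. The element and attribute names (slis, faculty, member, courses, course, id, number, instructor) are assumptions made for this example and are not taken from the project's actual file. The only instance data used is the S636 Semantic Web course taught by Ying Ding.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout; element and attribute names are illustrative only.
SAMPLE_XML = """<?xml version="1.0"?>
<slis>
  <faculty>
    <member id="yingDing">Ying Ding</member>
  </faculty>
  <courses>
    <course number="S636" instructor="yingDing">Semantic Web</course>
  </courses>
</slis>
"""


def courses_with_instructors(xml_text):
    """Return (course number, course title, instructor name) tuples."""
    root = ET.fromstring(xml_text)
    # Map each faculty id to the name stored as the element's text.
    names = {m.get("id"): m.text for m in root.iter("member")}
    return [
        (c.get("number"), c.text, names.get(c.get("instructor")))
        for c in root.iter("course")
    ]


if __name__ == "__main__":
    for number, title, instructor in courses_with_instructors(SAMPLE_XML):
        print(number, title, instructor)
```

Linking a course to its instructor by an id attribute is one possible design; nesting the instructor inside the course element would work equally well for data this small.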

DTD

A DTD (document type definition) provides a set of rules that standardize how tags appear in an XML file. Because XML alone does not provide a way to constrain how the data is structured, we used a DTD.
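Python's standard library parses XML but does not validate it against a DTD, so the sketch below only imitates one kind of rule a DTD expresses: which child elements may appear inside which parent. The content model and element names are hypothetical and are not the project's DTD. A real DTD also constrains element order, cardinality, and attributes, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model, loosely mirroring DTD declarations such as
#   <!ELEMENT courses (course*)>
# Each key is an element name; the value is the set of allowed child elements.
ALLOWED_CHILDREN = {
    "slis": {"faculty", "courses"},
    "faculty": {"member"},
    "courses": {"course"},
    "member": set(),
    "course": set(),
}


def structure_errors(xml_text):
    """List elements that are undeclared or appear under the wrong parent."""
    errors = []
    root = ET.fromstring(xml_text)
    if root.tag not in ALLOWED_CHILDREN:
        errors.append(f"undeclared root element <{root.tag}>")
    # Walk every element and compare its children with the content model.
    for parent in root.iter():
        allowed = ALLOWED_CHILDREN.get(parent.tag, set())
        for child in parent:
            if child.tag not in allowed:
                errors.append(
                    f"<{child.tag}> is not allowed inside <{parent.tag}>"
                )
    return errors
```

An empty list means the document fits the content model; each string in a non-empty list names one misplaced element.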

Ontology

Some weeks into the semester, we began working on our ontology. We met to review the elements from our XML file and discussed how we could make it look and, more importantly, function more efficiently. We first looked for existing ontologies that we could build upon instead of building ours from the ground up. No appropriate ontologies were found; the closest was a K-12 education ontology, which did not help us.

According to one of the class readings, by Natalya Noy and Deborah McGuinness, an ontology is an explicit formal specification of the terms in a domain and the relations among them. Encoded in a suitable language, it makes knowledge on Web pages understandable to electronic agents searching for information. Tools for working with such data include Protégé and Jena, which we used in this class and also in our project.

First we selected the Methontology Engineering Method, one of the three major engineering methods we had to choose from for our second homework, as the method for our project. We chose this method not only because we had already begun some of its steps when we developed the XML and DTD, but also because its description and steps were clear and concise. The steps are as follows:

  1. Capture the information and develop a specification document (as done with XML and DTD Schema above).
  2. Interpret it as a set of representations, that is, identify concepts, develop groupings, and describe instances and formulas.
  3. Implement the representation into a formal language (description language).
  4. During each phase, evaluate the model.

Step 2.1: Data Dictionary

Concept Name / Description / Instances / Attributes / Instance Attributes

Courses / Lists all the courses offered in SLIS and groups them by subject area.
  collectionCourse / hasInstructor, isRequiredCourseOf
  managementCourse / academicLibraryCourses, libraryManagement, publicLibraryManagement, schoolMedia
  organizationRepresentationCourse
  programmingCourse
  referenceCourse
  researchCourse / evaluationOfInformationSystems, evaluationOfResoursesAndServices, introToResearch

Degree / Lists all the degrees offered in SLIS and links them to some of the courses required to graduate. / hasRequiredCourseOf / Dual, MLS, MIS

Faculty / Lists the different faculty ranks in SLIS and links them to the courses they teach. / assistantProfessor, associateProfessor, fullProfessor / isInstructorOf

Room / Identifies in which rooms the courses are taught. / building, roomNumber

Step 2.2: Concepts Classification Tree

Step 2.3: Interpret it as a set of representations; develop groupings, instances and formulas

For this section of the project we used Protégé, an ontology editor, because it allowed us to easily describe classes, properties, and individuals, and to export the data as needed. First, we created two classes, Bloomington and SLIS, and four subclasses: courses, faculty, rooms, and staff. The subclasses also have subclasses of their own. See Figure 4 below. Originally, we were describing too many concepts with classes alone, including concepts that were clearly properties and individuals. Through our iterative meetings and class exercises, we realized this and corrected it. Figure 6 illustrates the properties we developed, with appropriate domains and ranges.

We did not complete our ontology in its entirety, as we did not include all possible individuals. For example, we only included required courses and the semantic web course, along with the corresponding instructors. This fraction of the instances still produced nearly 200 triples, along with a graph so large that it is not included in this report. If the complete SLIS course schedule and faculty roster were included, the number of triples would certainly be well over one thousand.

Figure 5

Figure 6
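To make the idea of properties with domains and ranges concrete, here is a minimal Python sketch. The property names come from the data dictionary in Step 2.1, but the domain and range assigned to each are assumptions inferred from those names, and the class memberships are illustrative. In OWL, domain and range declarations are normally used to infer class membership rather than to reject data, so treating them as a consistency check, as done here, is a simplification.

```python
# Property names come from the data dictionary; the (domain, range) pairs are
# assumptions inferred from those names, not a copy of the project's ontology.
PROPERTIES = {
    "hasInstructor": ("Courses", "Faculty"),
    "isInstructorOf": ("Faculty", "Courses"),
    "isRequiredCourseOf": ("Courses", "Degree"),
    "hasRequiredCourseOf": ("Degree", "Courses"),
}

# Class membership of a few individuals (illustrative).
TYPES = {
    "yingDing": "Faculty",
    "semanticWeb": "Courses",
    "MIS": "Degree",
}


def violates_domain_or_range(subject, prop, obj):
    """True if the subject or object is not in the property's domain or range."""
    domain, range_ = PROPERTIES[prop]
    return TYPES.get(subject) != domain or TYPES.get(obj) != range_
```

For example, violates_domain_or_range("yingDing", "isInstructorOf", "semanticWeb") is False, while swapping the subject and object makes it True.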

Step 3: Implement the representation into a formal language

Protégé provided the ability to easily export our ontology into a variety of formats. We used both the N3 and Turtle RDF syntax formats, as seen below. The Turtle format was primarily used because of its shorter length and readability.
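To give a sense of why Turtle reads compactly, the following Python sketch writes a handful of triples in Turtle syntax, grouping predicates that share a subject with a semicolon. The slis prefix and the http://example.org/slis# namespace are hypothetical placeholders, and the function is only a hand-written illustration, not the exporter that Protégé uses.

```python
from collections import defaultdict

PREFIX = "slis"
NAMESPACE = "http://example.org/slis#"  # hypothetical namespace


def to_turtle(triples):
    """Serialize (subject, predicate, object) local names as Turtle text."""
    # Group predicate/object pairs under their shared subject.
    by_subject = defaultdict(list)
    for s, p, o in triples:
        by_subject[s].append((p, o))

    lines = [f"@prefix {PREFIX}: <{NAMESPACE}> .", ""]
    for s, pairs in by_subject.items():
        # Turtle separates predicates of the same subject with ';'.
        body = " ;\n    ".join(f"{PREFIX}:{p} {PREFIX}:{o}" for p, o in pairs)
        lines.append(f"{PREFIX}:{s} {body} .")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_turtle([
        ("yingDing", "isInstructorOf", "semanticWeb"),
        ("semanticWeb", "hasInstructor", "yingDing"),
    ]))
```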

Step 4: During each phase, evaluate the model

We constantly evaluated our model throughout the semester in order to investigate different approaches to representing the data. This involved ensuring that the appropriate concepts were accounted for, with suitable individuals present to provide a “proof of concept”. Beyond this step, example queries were written in SPARQL to illustrate the kinds of data extraction that are possible, as shown below.

Example queries

What classes does Ying Ding teach?

Who teaches the Evaluation of Information Systems course?

What courses are required for the MIS degree?

Who teaches required MIS courses?
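The four questions above can each be read as one or two triple patterns. The Python sketch below mimics that pattern matching over a tiny in-memory list of triples, with None standing in for a query variable. Only the pairing of Ying Ding with the semantic web course comes from this report. The instructor name instructorA and the assignment of evaluationOfInformationSystems as an MIS requirement are hypothetical placeholders. The comments show roughly equivalent SPARQL patterns under an assumed slis: prefix.

```python
# Illustrative triples. Only the yingDing/semanticWeb pairing comes from the
# report; "instructorA" and the MIS requirement are hypothetical placeholders.
TRIPLES = [
    ("yingDing", "isInstructorOf", "semanticWeb"),
    ("instructorA", "isInstructorOf", "evaluationOfInformationSystems"),
    ("evaluationOfInformationSystems", "isRequiredCourseOf", "MIS"),
]


def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None plays the role of a variable."""
    return [
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# SELECT ?course WHERE { slis:yingDing slis:isInstructorOf ?course }
def courses_taught_by(person):
    return [o for _, _, o in match(s=person, p="isInstructorOf")]


# SELECT ?person WHERE { ?person slis:isInstructorOf slis:evaluationOfInformationSystems }
def instructors_of(course):
    return [s for s, _, _ in match(p="isInstructorOf", o=course)]


# SELECT ?course WHERE { ?course slis:isRequiredCourseOf slis:MIS }
def required_courses(degree):
    return [s for s, _, _ in match(p="isRequiredCourseOf", o=degree)]


# SELECT ?person WHERE { ?course slis:isRequiredCourseOf slis:MIS .
#                        ?person slis:isInstructorOf ?course }
def instructors_of_required(degree):
    return sorted({
        person
        for course in required_courses(degree)
        for person in instructors_of(course)
    })
```

The last function joins two patterns on the shared course, which is what the two-line SPARQL pattern in its comment expresses.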

Reasoning

There are several opportunities for logical reasoning in this ontology, in which explicit data does not have to be encoded. For example, if a course is required for either the MIS or MLS degree, it is also required for the dual degree. This allows for the dual degree to not require explicit encoding of required courses.
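The dual-degree rule described above can be sketched as a single forward-inference pass in Python. The course-to-degree assignments in the usage example are hypothetical and are chosen only to exercise the rule. A reasoner such as the one in Jena would express this declaratively rather than in a loop.

```python
def infer_dual_requirements(triples):
    """Add (course, isRequiredCourseOf, Dual) for every MIS or MLS requirement."""
    inferred = set(triples)
    for course, prop, degree in triples:
        # Rule: required for MIS or MLS implies required for the dual degree.
        if prop == "isRequiredCourseOf" and degree in ("MIS", "MLS"):
            inferred.add((course, "isRequiredCourseOf", "Dual"))
    return inferred


if __name__ == "__main__":
    # Hypothetical explicit data; no Dual requirement is encoded by hand.
    explicit = {
        ("evaluationOfInformationSystems", "isRequiredCourseOf", "MIS"),
        ("introToResearch", "isRequiredCourseOf", "MLS"),
    }
    for triple in sorted(infer_dual_requirements(explicit)):
        print(triple)
```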

Other logic, slightly beyond the motivating scenario of this ontology, could be applied if an agent were accessing the data. For example, a student could enter his or her schedule data into an agent, which would then query our data for appropriate courses. Logical inclusions and exclusions could then inform decisions about the schedule (e.g., schedule required courses before non-required ones, do not schedule courses that are within 15 minutes of each other and meet in buildings more than 1 mile apart, etc.).
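The two scheduling rules mentioned above might be sketched in Python as follows. The function names, the way times and distances are passed in, and the sample course names are all assumptions made for illustration, since the report does not describe how an agent would obtain this data. Overlapping meeting times are not treated specially here.

```python
from datetime import datetime, timedelta


def conflicts(first_end, second_start, distance_miles,
              min_gap=timedelta(minutes=15), max_distance=1.0):
    """True if the second course starts within min_gap of the first ending
    and the two buildings are more than max_distance miles apart."""
    gap = second_start - first_end
    return gap <= min_gap and distance_miles > max_distance


def order_by_requirement(courses, required):
    """Place required courses first, keeping the original order otherwise."""
    # sorted() is stable, and False sorts before True.
    return sorted(courses, key=lambda course: course not in required)


if __name__ == "__main__":
    end = datetime(2008, 12, 15, 10, 0)
    start = datetime(2008, 12, 15, 10, 10)
    print(conflicts(end, start, distance_miles=1.5))
    print(order_by_requirement(["semanticWeb", "introToResearch"],
                               {"introToResearch"}))
```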

Current Status

Currently, there is still some work left to do in order to fully turn this project into portfolio-worthy work. This includes finishing the Java code to interface the data with a PostgreSQL database and perform reasoning functions. Additionally, moving the application onto Tomcat via Computer Science’s Silo server, in order to provide access over the web, is in progress. The concepts behind these actions are well understood by all group members; only technical limitations are left to overcome. This final stage of the project will likely be completed over winter break in order to produce a fully polished deliverable.
