Page | 1
DPS DISSERTATION DATABASE
Andriyan Leo, Fabio Garcia, and Xu Yang
Page | 1
Abstract -- The project that we were assigned is called “Analysis of DPS Dissertations”. This project is to create a database for the Doctor of Professional Studies (DPS) dissertations. There are 67 dissertations in total and each dissertation is in PDF format. Currently, Pace University does not have a functional database to keep track of each individual dissertation. As a result, there are information issues when more dissertations are created. For example: it is difficult to know which professor was assigned as an advisors to assist the author in completing his or her dissertation, and also it is almost impossible to know the average number of pages in each DPS dissertations without having a proper database setting.
Due to the aforementioned problem, we decided to develop a database with a web interface that will lessen or eliminate the issues encountered in the DPS Department. This database project will play a key role in enhancing the operations of getting relevant information for DPS dissertations and to better serve the information system needs of the Pace University. This new database for DPS Dissertation System can substantially increase the efficiency of operations, time, and increase productivity.
I. PROJECT OVEVIEW
Team 6 will organize the 67 DPS dissertations and provide the department of Seidenberg a database with a web interface. The database will be used to organize the dissertation in order to support processes that require the retrieval of DPS information such as author or committee member information via the web interface. The system will be developed using PHP for the development dynamic web pages and MySQL for database. Below summarizes some of the functions of the database:
· Ascertain information on individual dissertations and compute averages such as average number of pages.
· Getting a table of subject categories with dissertation counts, methods used, and average years-to-completion
· Getting a table of dissertation advisors with dissertation counts, breakdown of subject categories and methods used, and average years-to-completion (this will provide information on the subject categories and methods supported by faculty members)
· table of methods used with average years-to-completion
II. DATABASE DESIGN
A. Database Model Specifications
The objective of the database is to manage as much information that can be gathered from the 67 dissertations. We are splitting the database into 3 tables: Dissertations, Committeemembers and Externalpublications. All the tables contain a primary key and some foreign keys. The relationships between the tables are many to many relationships. Figure 1 demonstrates the ER diagram for DPS dissertation database
Figure 1. ERD of DPS Dissertation Database
Page | 1
II. DATABASE STRUCTURE
Figure A. External Publication Table Details
Figure B. Dissertation Table Details
Figure C. Committee Members Table Details
III. DATABASE AND TABLES CREATIONS
A. Database Installation
1. If you haven't done so already, download the free MySQL Server Community Edition otherwise go to next section “Table and Data Creation Scripts”. (You can download the software from the MySQL website.) Be sure to download the Windows (x86) version (way at the bottom of the page), which includes a Windows Installer.
2. It's time to install SQL Server. The installation file comes as a .zip file. Double click the file. Your unzipping software should open the file and show you a list of files inside the archive. There should be only 1, "Setup.exe". double click the "Setup.exe" file. Once done, you should see a window that looks like below and click “Next>”. Then, follow the rest of the instructions.
3. Once the installation is complete, you will be presented with a "MySQL Sign-Up" window. Click the radio button "Skip Sign-Up", and then click "Next>". For further details of MySQL installation instructions, please go to http://www.wikihow.com/Install-the-MySQL-Database-Server-on-Your-Windows-PC
B. Database and Table Creation steps
1. Open MySQL Workbench
2. Double click on “Local instance mysql”
3. At the top screen, there is a window section called, Query 1. Copy and paste the entire SQL command located on page 3 into Query 1. Then, click execute button located on top left section.
C. Loading Data into Tables
1. Open a fresh query page by clicking the button located at the top left section
2. Open the script attached on this document under SQL folder, InsertCommiteemembers
3. Copy and paste the entire content to the query page and click execute button located on top left section.
4. Repeat the steps above for executing InsertDissertations and InsertExternalPublication scripts.
Page | 1
CREATE DATABASE `cs691-team06-2012`;
USE `cs691-team06-2012`;
CREATE TABLE `committeemembers` (
`Name` char(50) NOT NULL,
`Institution` char(100) DEFAULT NULL,
PRIMARY KEY (`Name`)
) ;
CREATE TABLE `dissertation` (
`authorid` char(5) DEFAULT NULL,
`author` char(50) NOT NULL,
`classyear` char(5) DEFAULT NULL,
`fractionofyearsnotenrolled` char(3) DEFAULT NULL,
`dateofsuccessfuldefense` date DEFAULT NULL,
`monthstocompletion` int(4) DEFAULT NULL,
`dissertationtitle` char(100) DEFAULT NULL,
`committeemember1` char(50) NOT NULL,
`committeemember2` char(50) NOT NULL,
`committeemember3` char(50) NOT NULL,
`committeemember4` char(50) DEFAULT NULL,
`committeemember5` char(50) DEFAULT NULL,
`earlieradvisor1` char(50) DEFAULT NULL,
`earlieradvisor2` char(50) DEFAULT NULL,
`primarysubjectcategory` char(50) DEFAULT NULL,
`secondarysubjectcategory` char(50) DEFAULT NULL,
`tertiarysubjectcategory` char(50) DEFAULT NULL,
`primarymethodused` char(50) DEFAULT NULL,
`secondarymethodused` char(50) DEFAULT NULL,
`tertiarymethodused` char(50) DEFAULT NULL,
`numberofpagestotal` int(4) DEFAULT NULL,
`numberofpageswithoutappendices` int(4) DEFAULT NULL,
`numberoffigures` int(4) DEFAULT NULL,
`numberoftables` int(4) DEFAULT NULL,
`numberofnumberedandcitedreferences` int(4) DEFAULT NULL,
`numberofbibliographydocuments` int(4) DEFAULT NULL,
`titleofexternalpublication1` char(150) DEFAULT NULL,
`titleofexternalpublication2` char(150) DEFAULT NULL,
PRIMARY KEY (`author`),
KEY `fk_titleofexteralpublication2` (`titleofexternalpublication2`),
KEY `fk_titleofexteralpublication1` (`titleofexternalpublication1`),
KEY `fk_earlieradvisor1` (`earlieradvisor1`),
KEY `fk_earlieradvisor2` (`earlieradvisor2`),
KEY `dissertation_ibfk_1` (`committeemember1`),
KEY `dissertation_ibfk_4` (`committeemember4`),
KEY `dissertation_ibfk_5` (`committeemember5`),
KEY `dissertation_ibfk_6` (`earlieradvisor1`),
KEY `dissertation_ibfk_7` (`earlieradvisor2`),
KEY `dissertation_ibfk_2` (`committeemember2`),
KEY `dissertation_ibfk_3` (`committeemember3`),
CONSTRAINT `dissertation_ibfk_1` FOREIGN KEY (`committeemember1`) REFERENCES `committeemembers` (`Name`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `dissertation_ibfk_2` FOREIGN KEY (`committeemember2`) REFERENCES `committeemembers` (`Name`),
CONSTRAINT `dissertation_ibfk_3` FOREIGN KEY (`committeemember3`) REFERENCES `committeemembers` (`Name`),
CONSTRAINT `dissertation_ibfk_4` FOREIGN KEY (`committeemember4`) REFERENCES `committeemembers` (`Name`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `dissertation_ibfk_5` FOREIGN KEY (`committeemember5`) REFERENCES `committeemembers` (`Name`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `dissertation_ibfk_6` FOREIGN KEY (`earlieradvisor1`) REFERENCES `committeemembers` (`Name`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `dissertation_ibfk_7` FOREIGN KEY (`earlieradvisor2`) REFERENCES `committeemembers` (`Name`) ON DELETE NO ACTION ON UPDATE NO ACTION
);
CREATE TABLE `externalpublication` (
`titleofexternalpublication` char(150) NOT NULL,
`author` char(50) NOT NULL,
`authorcommitteemember1` char(50) DEFAULT NULL,
`authorcommitteemember2` char(50) DEFAULT NULL,
`authorcommitteemember3` char(50) DEFAULT NULL,
`authorcommitteemember4` char(50) DEFAULT NULL,
`authorcommitteemember5` char(50) DEFAULT NULL,
`othercitinginformation` char(250) DEFAULT NULL,
PRIMARY KEY (`titleofexternalpublication`),
KEY `authorcommitteemember1` (`authorcommitteemember1`),
KEY `authorcommitteemember2` (`authorcommitteemember2`),
KEY `authorcommitteemember3` (`authorcommitteemember3`),
KEY `authorcommitteemember4` (`authorcommitteemember4`),
KEY `authorcommitteemember5` (`authorcommitteemember5`),
KEY `externalpublication_ibfk_1` (`author`),
CONSTRAINT `externalpublication_ibfk_1` FOREIGN KEY (`author`) REFERENCES `dissertation` (`author`),
CONSTRAINT `externalpublication_ibfk_2` FOREIGN KEY (`authorcommitteemember1`) REFERENCES `committeemembers` (`Name`),
CONSTRAINT `externalpublication_ibfk_3` FOREIGN KEY (`authorcommitteemember2`) REFERENCES `committeemembers` (`Name`),
CONSTRAINT `externalpublication_ibfk_4` FOREIGN KEY (`authorcommitteemember3`) REFERENCES `committeemembers` (`Name`),
CONSTRAINT `externalpublication_ibfk_5` FOREIGN KEY (`authorcommitteemember4`) REFERENCES `committeemembers` (`Name`),
CONSTRAINT `externalpublication_ibfk_6` FOREIGN KEY (`authorcommitteemember5`) REFERENCES `committeemembers` (`Name`)
);
Page | 1
IV. DATABASE INFORMATION
A. Committee Members Table
Committee Members table is the first table that we create in the database. This table contains the list of the committee members and advisor names at Pace University. The advisors are working closely with authors’ dissertations from the inception of the topic through all the stages of the proposal and the writing of the dissertation. At the defense, these advisors play a significant role in responding to and approving the final draft of authors’ dissertations.
B. Dissertation Table
Among all the tables in DPS Dissertation database, Dissertation is the most important and informative table. This table contains necessary information from the author’s research and research methodologies. This table not only contains author’s name and dissertation topic but also contains more specific information such as the length of the dissertation, author’s class year and date of successful defense. The dissertation table is the core table in this database because most of the operation and process of data result from this table.
C. External Publication Table
The External Publication table records every external publication that has been cited by the authors. External publication is a way to use other people’s work without plagiarizing. Every author has quite a few external sources on his or her dissertation. Consequently, this information can be taken at the very end of every dissertation file in the reference sections.
D. Constrains
In the Committeemember table, there is only one primary key which is committee members’ names. There are no foreign keys in this table.
For dissertation tables, we have one primary key and seven foreign keys. The primary key on this table is referred to the author’s name and all the foreign keys are referred to the committee members’ names under the Committeemembers table.
In the Externalpublication table, there is only one primary key which is the title of the publication. This table has six foreign keys. The first foreign key references the author’s name under dissertation table while the rest of the foreign keys reference the committee member’s name under Committeemembers table.
V. DATA EXTRACTION METHODS
There are several methods on how we gathered the data, manual extraction and secondary software. Some of the data was given immediately such as author’s name, dissertation title and date of defense. However, most of the data had to be extracted from the dissertation content. As a result, Team 6 is trying to find software that could help in extracting specific information from the content of every dissertation PDF file. We used TiffPDF which is a software tool that identifies pages totals for every PDF file. We are being helped by one of our customers, Rinaldo DiGiorgio, in providing use of other extraction software such as Lingpipe and Mallet to extract information from PDF files. However, he is going to test the software and provide us more information on how to use these tools.
VI. DATABASE TEST OVERVIEW
After we collected all the information that we could get, we populated the data into the database. Towards the final week, we managed to gather all the necessary information to have a complete functioning database. Before moving on to the web interface, we checked our database by conducting sanity and query testing the database. The next section has some of the samples of what we tested on the database to make sure that the database has the correct data, constraints and generated query results properly. After that, we moved on to the web interface design.
VII. DATABASE TEST INSTRUCTION
The next section will elaborate more on database testing. Here’s the instruction to follow and get the same results:
· Open a fresh query page by clicking the button located at the top left section.
· Copy and paste the SQL command written on Italic font.
· Click execute button located on top left section to retrieve the result.
Page | 1
Page | 1
Page | 1
Page | 1
VIII. DATABASE TESTING
A. Sanity Checks
· Committee Member Table (Figure 2)
Ø Query: Select * from committeemembers
Ø Result: 47 records found
Figure 2. Query Result of Committemembers table
· External Publication Table (Figure 3A)
Ø Query: Select * from externalpublication
Ø Result: 42 records found
Figure 3A. Query Result of Externalpublication table
· Dissertation Table (Figure 3)
Ø Query: Select * from dissertation
Ø Result: 67 records found
Figure 3. Query Result of Dissertation table
B. Running Queries
· Query Test 1:
· Finding the shortest dissertation’s duration (Figure 4)
Ø Query: select author, dissertationtitle,
Min(monthstocompletion)
from dissertation
Ø Result: 1 record found
Figure 4. Query Test 1 Result of Dissertation table
· Query Test 2:
· We wanted to find how many committee members were the primary advisors (committeemember1) for the authors out of 46 committee members. (Figure 5)
Ø Query: select name from
Committeemembers where
name in
(select committeemember1 from
dissertation);
Ø Result: 19 records found
Figure 5. Query Test 2 Result of
Committeemembers table
Page | 1
· Query Test 3:
· Finding authors who completed their dissertation less than 3 years (Figure 6)
Ø Query: select author, dissertationtitle
from dissertation where
monthstocompletion <=36
Ø Result: 39 records found
Figure 6. Query Test 3 Result of Dissertation Table
· Query Test 4:
· Finding how many times an advisor was the primary advisor (Figure 7)
Ø Query: select distinct
committeemember1,
count(committeemember1) as
count from dissertation group
by committeemember1
order by count desc;
Ø Result: 19 records found
Figure 7. Query Test 4 Result of Committeemembers Table
Page | 1
Page | 1
IX. WEB INTERFACE DESIGN
Page | 1
The UI (User Interface) was built using PHP, HTML, a CSS grid system, jQuery, along with a few jQuery libraries. The grid system is used in order to avoid having to style every item in order to position throughout the page. Using a grid system helps by placing a class such as grid_6 to specify which container the DOM element will be placed in. jQuery was used because of its powerful selector based system. Using a DOM element’s id one can select an Item the element and manipulate it to one’s specification. jQuery also allows the use of classes for selectors. This allows one to manipulate multiple items using the same class just as it would be done using CSS.
The theme in the UI was built using jQuery’s theme roller. This is used in conjunction with the jQuery framework. Using those selectors styles were added to many different DOM elements by including a UI library and the custom CSS class (Figure 8).