8
Computer Science 360, Project Description
General Information
You need to design, implement, and document a relational database to solve a particular informational problem of your choice. For full credit, you will need to do a Web implementation of your database.
This was the book that was selected as a reference for MySQL and related technologies: Nixon, Robin, "PHP, MySQL & JavaScript", Sebastopol, CA: O'Reilly Media, c. 2009, ISBN: 978-0-596-15713-5. That book recommends using EasyPHP as a simple mechanism for installing a Windows, Apache, MySQL, and PHP development environment. The relevant downloads and instructions can be obtained at http://easyphp.org. You may also develop in an Apple or Linux environment if you want to, and you are not obligated to use EasyPHP if you prefer an alternative.
This is another book which is a reference resource on PHP and MySQL. A copy is on reserve in the library and I have a copy for reference in my office: Luke Welling and Laura Thomson, PHP and MySQL Web Development, Addison-Wesley, ISBN: 978-0-672-32916-6.
This is yet another book which is a reference on these topics. I just have one copy, in my office: Hugh E. Williams and David Lane, Web Database Applications with PHP and MySQL, O'Reilly, ISBN: 0-596-00543-1.
The grading of your project falls into three interrelated categories. First of all, in order to receive credit for your project, you have to do a presentation. However, the presentation itself is not graded. Your performance in front of the class will not be evaluated. The content of the presentation and the grading will stem from two other aspects of the project. The first 8 parts will consist of printed documentation. You will have to hand that in in hardcopy form. A large share of this documentation will naturally result from the implementation. The remaining 4 parts will have to be implemented so that I can check them live.
This statement appeared in the syllabus: The project will be evaluated based on the presence or absence of the required features and the quality of the documentation and implementation. The exact requirements will be spelled out in this document, in the order in which they should be accomplished. Quite frankly, the differential in points earned by different students should be due to attrition. In other words, roughly speaking, one student may complete the first 6 of 12 parts, the next student may complete the first 9 of 12 parts, yet another may complete all 12 parts, earning 60, 90, and 120 points, respectively.
You can derive your project from any source.
· You may be interested in creating a business or organizational informational scenario that is completely original.
· You may also have come into contact with an informational scenario, including a relational database solution to it, through your work, an internship, another class, or some other setting.
· If you don’t have an original idea and have not been exposed to an informational problem from another source, you also have the option of going to the library and looking for examples in books.
· Your choice of project for this class may also overlap with your senior project.
Your project should consist of at least 6 related tables and no more than 10. For people trying to come up with a project from scratch, reaching a total of 6 meaningful tables makes a suitable challenge. For people working from something that already exists, trimming it down to a meaningful subset of no more than 10 tables makes a suitable challenge. The range of 6 to 10 assures that you will be faced with design decisions that are not completely trivial while not being burdened with excessive work. It also means that I will be evaluating projects that should be roughly comparable in size and I will be making grading decisions on quality, not quantity. I also will not be burdened with the excessive work of trying to sort through projects with dozens of tables.
The materials you produce to satisfy this project should be original. If you are working from an existing database with documentation, you will have to tailor what you hand in to the specific requirements given in this document. It will not be acceptable to simply Xerox or print out documentation that already exists. Note that the database management system may have features that essentially allow you to generate the documents requested. It would behoove you to find out about these capabilities in the software and take advantage of them.
All of the textual materials should be prepared using a word processor or some other piece of software (like the dbms) which produces textual output. Tabular output should either be generated by the database or created using the table creation features of a word processor or other piece of appropriate software. Note that it may be possible to take screen shots in some cases rather than generating separate output files. Also note that for one or more of the items below, certain kinds of screenshots will not be acceptable. Those cases are included in the numbered list of requirements. Also note that in theory, a hand-drawn E-R diagram would be acceptable, but additional points will be given for using a graphics program, like Visio, to generate the diagram.
What follows is a categorized, numbered list of project requirements. The list is given in the order in which you should do the items on it. When preparing materials to hand in, do them sequentially and assemble them in the order given. Do not jump around. When grading I will start with item 1 and expect to find each succeeding item in order. The expectation is that those parts that you do do will be done correctly.
There are 12 items on the list, each worth 10 points, accounting for a total of 120 points. With the exception of one item, each point is broken into two subparts or levels. The grading scheme takes this into account. If an item is absent or is manifestly wrong, you will receive 0 points for it. If the accomplishment of the item only does the first subpart or only reaches the first level (labeled A), you will receive 5 points for it. If the accomplishment of the item includes both subparts or reaches the second level, you will receive 10 points for it.
Specific Requirements for What You Have to Hand in
I. Documentation of the Database Overall.
1. One or more paragraphs of text introducing the informational scenario at hand should be written.
A. What is the problem domain? What was the project derived from?
B. What purpose would the database serve? How might it be useful?
2. An entity-relationship model of the database which will support the scenario should be drawn.
A. Hand drawn.
B. Created using a tool like Visio.
II. Documentation of the Tables
3. The design and meaning of the tables should be documented.
A. Schemas for each table in the database. This means a complete list of fields in each table with information about size, type, and format, whether or not nulls are allowed, and whether or not uniqueness is enforced (if they are non-key fields). Primary key fields should be identified. All foreign key fields in tables have to be explicitly identified. All referential integrity support decisions have to be documented and justified. This refers to cascading or restricting updates or deletes, for example. Note that index creation is treated separately in a later point. It will be assumed that you have fully normalized your design. If inspection reveals this not to be the case you will be penalized for a lapse in correctness. See the concluding comment in part B.
Do not try to document the table schemas by printing out table design screen shots from MS Access, for example. Document this by writing full SQL table creation statements for each table in the database. If, by chance, you decide to use the command line interface for MySQL, you will have written these SQL statements and can simply print them out, whether directly or by means of a screenshot. If you used a graphical user interface for MySQL and all of the design information for a table can be shown in a single screenshot, then screenshots for each table can be used for this part of the assignment. Notice the word all. The point is that all of the information for a table design is not shown in a single Access screenshot. The list of all fields and the referential integrity decisions can't be shown at the same time, for example. That's why it's unacceptable. If you can't get all of the information that would be in a complete SQL create statement for a table in a single screen shot, then document this point using SQL or by some other means.
B. A brief verbal description of the entities in the informational scenario and the role they play in it. This verbal documentation is important for the user/reader who isn't familiar with the domain of the database. It's necessary to explain in human terms what the tables are for. Notice that this is part B, not part A. Part A will include more work. You're not eligible for the points for part B unless you do part A. Sometimes it's necessary or desirable for a design to incompletely normalized. If your design has a table that is intentionally not fully normalized, in addition to the general description, include a few written sentences explaining what practical consideration resulted in the incomplete normalization.
4. The contents of each table after being populated with sample data should be printed out. Putting in sample data may seem like busywork, but there are two factors worth considering. First of all, in order to test the various possible combinations of data that the database will allow, a certain number of records is needed. For example, you might like to have foreign key fields with actual values and foreign key fields with nulls. Or you might want to consider various combinations across relationships between more than two tables. More concretely, later on when writing queries, you may want to make sure that you have enough different records so that the queries generate results, or results where the meaning of the query is illustrated by the results generated. The second consideration is simply student interest. If you are truly interested in your project, you may be motivated to enter lots of (or, god forbid, a realistic, practical number of) records. If so, the extra work will be rewarded.
A. 1-5 records in each table.
B. 6 or more records in each table.
5. A data dictionary should be created.
A. Domain documentation. If there are fields in different tables of the database that have the same semantic meaning and the same format, then this defines a domain. You should list these domains along with their meaning. For each domain you should list all of the fields in the database which are on each of these domains. The most obvious examples of two fields being on the same domain are corresponding primary and foreign key fields. Another simplistic example would be an entity address field appearing in more than one different table. Strictly speaking, a single field also defines a domain, but it is not necessary to document these individually.
B. Field documentation. You should make a table of all of the fields in the database where you can look up, at a minimum, which table they're in. (Recall that the same field name may occur more than once in a database if the same name was used in different tables.) A complete lookup table for field documentation would also give the types and lengths of fields and what domain they're on. You may produce this part of the data dictionary using a word processor, but in a fully relational system a table with a name like SYSCOLUMN, which contained this information, would exist in the SYSCATALOG. If you implement your project on such a system, all you would have to do is print this out.
III. Queries, Indexes, Reports, and Forms
6. Basic Queries. The project database should have SQL queries written for it. Along with the SQL, each query should be documented by a written description of what it does or what it is good for, and its use should be illustrated with the output of the query on the sample data that the database tables have been populated with.
A. 4 queries consisting of SELECT blah FROM blah WHERE blah.
B. 4 queries including things like ORDER BY, GROUP BY, etc. These can included any features beyond A, above, but not including those features specifically noted under point 7. Advanced Queries, below.
7. Advanced Queries. The instructions for documenting these queries are the same as the instructions for documenting the basic queries.
A. 4 queries containing joins, 2-way, 3-way, joins with other syntactical features.
B. 4 queries containing subqueries, IN/NOT IN, EXISTS/NOT EXISTS, double NOT EXISTS.
8. Indexes should be created for certain fields in the database. Most systems provide indexes on the primary key field by default. You should implement additional indexes in your database. Whether you used a graphical user interface to create them or not, these indexes should be documented by showing the SQL command which would create them.
A. Create 4 indexes at random.
B. Create 4 indexes (not in addition to the 4 of point A, but instead of random indexes) which are tied directly to one of the queries of the previous points. You should justify these indexes by explaining how the sample queries might have their performance improved because of the index, assuming that the tables in the database had a realistic number of records in them.
IV. Web Aspects
Items 9-12 relate specifically to doing a Web implementation of the project. The goal is to show your ability to manage tools, like PHP, HTML and MySQL, in order to connect a database backend to a Web browser frontend. This is the part of your project that has to be live. As noted already, you can use any resource you want to for learning how to use the software tools. In the interests of clarity of the assignment, these items refer directly to things that are illustrated in the Nixon book.