Project 2: Playing with XML DATA

HW4: XML DATA MGMT

The aim of this homework is to give you some practice with:

  1. Writing XML DTDs or XML Schemas
  2. Generating tagged XML data
  3. Writing XQuery queries on the data and executing them
  4. Comparing the XML and RDBMS models of the data

You have the following files:

  1. Faculty.txt: contains information about faculty members, their research interests, where they got their highest degree, and when they got it.
  2. OfficeHours.txt: contains the office hours and office numbers of the faculty.
  3. Phonebook.txt: contains the phone numbers, office numbers, and email addresses of the faculty.
  4. CourseListing.txt: contains information about the courses offered in the department.

  • Task 1. You are required to generate an XML file to store this data. This involves:
    • Deciding on a schema for the data, and writing a DTD or XML Schema for it.
    • Translating the raw data into tagged XML data (where the tags are determined by the schema you designed).

For the second step (translating the raw data), you are encouraged to write a program that tags the data automatically. You can also tag it manually, since the data set is small enough.
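A tagging program of this kind could look like the sketch below, which handles one record of Faculty.txt. It is a minimal illustration, not a reference solution. The record layout (name, title, degree and university, year, interests) is inferred from the sample data, and the element names `faculty`, `name`, `title`, `degree` and `interest` are hypothetical. Your own schema decides the real tags.

```python
import re
import xml.etree.ElementTree as ET

# Assumed record layout, inferred from the sample lines in Faculty.txt:
#   name, title, degree university, year[, interest, interest, ...].
FACULTY_LINE = re.compile(
    r"^(?P<name>[^,]+),\s*"
    r"(?P<title>.+?),\s*"
    r"(?P<degree>Ph\.D\.|M\.S\.)\s+"
    r"(?P<university>.+?),\s*"
    r"(?P<year>\d{4})[.,]?\s*"
    r"(?P<interests>.*)$"
)


def tag_faculty_line(line):
    """Turn one Faculty.txt record into a <faculty> element (hypothetical tags)."""
    match = FACULTY_LINE.match(line.strip())
    if match is None:
        raise ValueError("unrecognized faculty record: " + line)

    faculty = ET.Element("faculty")
    ET.SubElement(faculty, "name").text = match["name"]
    ET.SubElement(faculty, "title").text = match["title"]
    degree = ET.SubElement(faculty, "degree", year=match["year"])
    degree.set("type", match["degree"])
    degree.text = match["university"]

    # Lecturers have no interests listed, so this loop may add nothing.
    for topic in match["interests"].rstrip(".").split(","):
        if topic.strip():
            ET.SubElement(faculty, "interest").text = topic.strip()
    return faculty


if __name__ == "__main__":
    sample = ("Chitta Baral, Associate Professor, Ph.D. University of Maryland, "
              "1991, Artificial intelligence, multimedia, visualization of databases.")
    print(ET.tostring(tag_faculty_line(sample), encoding="unicode"))
```

Records that deviate from the assumed layout raise a `ValueError`, so they can be spotted and tagged by hand.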

  • Task 2. Express each of the following as XQuery queries on the data you generated in Task 1. Execute them on an XQuery engine (information below).
    • (Selection/Projection) Generate a file containing the names of faculty members and their office numbers.
    • (Reformatting) Redo the previous query so that the output is an HTML page which, when rendered by a browser, lists, one per line, the names of the faculty members (in bold) and their office numbers (in italics).
    • (Constraints) Find all the associate professors who got their Ph.D. degree after 1990. Output their names, research interests, and email addresses.
    • (Join/Integrate) Find all the professors who are interested in artificial intelligence or databases, and output the name, title, research interests, and courses offered.
    • Two more queries of your own.
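One way to sanity-check the results of the Selection/Projection and Reformatting queries is to compute the same output independently and compare. The sketch below does this in Python on a two-record sample. It does not replace the XQuery queries themselves. The tag names `department`, `faculty`, `name` and `office` are assumptions, so substitute the tags from your own schema.

```python
import html
import xml.etree.ElementTree as ET

# Tiny stand-in for the Task 1 output; the tag names are assumptions.
SAMPLE_XML = """
<department>
  <faculty><name>Chitta Baral</name><office>GWC 366</office></faculty>
  <faculty><name>Rida Bazzi</name><office>GWC 336</office></faculty>
</department>
"""


def names_and_offices(xml_text):
    """Selection/projection: keep only (name, office) for each faculty member."""
    root = ET.fromstring(xml_text)
    return [(f.findtext("name"), f.findtext("office"))
            for f in root.findall("faculty")]


def as_html(pairs):
    """Reformatting: one faculty member per line, name in bold, office in italics."""
    lines = ["<b>{}</b> <i>{}</i><br>".format(html.escape(name), html.escape(office))
             for name, office in pairs]
    return "<html><body>\n" + "\n".join(lines) + "\n</body></html>"


if __name__ == "__main__":
    print(as_html(names_and_offices(SAMPLE_XML)))
```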

You will run the XQuery queries using an XQuery tool such as

You are free to use any other XQuery engine. If you have any questions regarding installation etc., contact the TA.

  • Task 3. Repeat Task 1, but convert the data into a relational database format (tables). Describe the tables you will need and their schemas.
  • Task 4. Repeat Task 2 against the tables from Task 3, writing the queries in SQL. In some cases, you may need a union of SQL queries.
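As a rough illustration of the relational side, the sketch below builds a two-table design in an in-memory SQLite database and runs the Constraints query in SQL (without the email addresses, which would need a further table). The table and column names (`faculty`, `interest`, `degree_year`, `topic`) are assumptions chosen for the example, not a required design. The rows are a small sample taken from Faculty.txt.

```python
import sqlite3


def build_sample_db():
    """Create a hypothetical two-table schema and load a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE faculty (
            name        TEXT PRIMARY KEY,
            title       TEXT NOT NULL,
            degree      TEXT NOT NULL,
            degree_year INTEGER NOT NULL
        );
        -- Interests are multi-valued, so they get their own table.
        CREATE TABLE interest (
            faculty_name TEXT NOT NULL REFERENCES faculty(name),
            topic        TEXT NOT NULL
        );
    """)
    conn.executemany("INSERT INTO faculty VALUES (?, ?, ?, ?)", [
        ("Chitta Baral", "Associate Professor", "Ph.D.", 1991),
        ("Sourav Bhattacharya", "Associate Professor", "Ph.D.", 1993),
        ("Partha Dasgupta", "Associate Professor", "Ph.D.", 1984),
        ("Subbarao Kambhampati", "Professor", "Ph.D.", 1989),
    ])
    conn.executemany("INSERT INTO interest VALUES (?, ?)", [
        ("Chitta Baral", "Artificial intelligence"),
        ("Chitta Baral", "multimedia"),
        ("Sourav Bhattacharya", "ATM networks"),
        ("Sourav Bhattacharya", "dependable communication"),
        ("Partha Dasgupta", "Distributed operating systems"),
        ("Subbarao Kambhampati", "Artificial intelligence"),
    ])
    return conn


# Associate professors whose Ph.D. was awarded after 1990, with their interests.
# LIKE also matches longer titles such as "Associate Professor and Associate Chair ...".
CONSTRAINT_QUERY = """
    SELECT f.name, i.topic
    FROM faculty AS f
    JOIN interest AS i ON i.faculty_name = f.name
    WHERE f.title LIKE 'Associate Professor%'
      AND f.degree = 'Ph.D.'
      AND f.degree_year > 1990
    ORDER BY f.name, i.topic
"""

if __name__ == "__main__":
    for name, topic in build_sample_db().execute(CONSTRAINT_QUERY):
        print(name, "|", topic)
```

Storing interests in a separate table is one way to model a repeating field that XML can nest directly inside the faculty element. This is the kind of difference the XML/RDBMS comparison asks you to consider.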

Faculty.txt

Stephen S. Yau, Professor and Chair, Ph.D. University of Illinois at Urbana-Champaign, 1961, Software engineering, parallel processing and distributed computing systems, embedded systems.

Edward A. Ashcroft, Professor, Ph.D. University of London (Imperial College), England, 1970, Program verification, declarative language, intensional programming, high-level parallel programming language, language Lucid.

Chitta Baral, Associate Professor, Ph.D. University of Maryland, 1991, Artificial intelligence, multimedia, visualization of databases.

Rida A. Bazzi, Assistant Professor, Ph.D. Georgia Institute of Technology, 1994, Distributed computing, software engineering for distributed systems, fault-tolerance algorithms, computer vision.

Sourav Bhattacharya, Associate Professor, Ph.D. University of Minnesota, 1993, Networked and parallel computing, dependable communication, ATM networks.

K. Selcuk Candan, Assistant Professor, Ph.D. University of Maryland, 1997, Distributed multimedia systems, video servers for video-on-demand systems, content based video indexing, query / retrieval of multimedia data, security, query processing.

James S. Collofello, Professor, Ph.D. Northwestern University, 1978, Software engineering, project management.

Partha Dasgupta, Associate Professor, Ph.D. State University of New York at Stony Brook, 1984, Distributed operating systems, system software.

Joseph DeLibero, Lecturer, M.S. Purdue University, 1972.

Suzanne W. Dietrich, Associate Professor, Ph.D. State University of New York at Stony Brook, 1987, Databases, knowledge management, object management, active features.

Leonard Faltz, Associate Professor, Ph.D. University of California, Berkeley, 1977, Formal linguistics, computational linguistics.

Gerald E. Farin, Professor, Ph.D. Technical University of Braunschweig, 1979, Computer aided geometric design, NURBS.

Nicholas V. Findler, Professor Emeritus, Ph.D. Budapest University of Technical Sciences, 1956, Artificial intelligence, heuristic programming, expert systems, pattern recognition, information retrieval.

Barbara Gannod, Lecturer, Ph.D. Michigan State University, 1997, Pure and applied graph theory, algorithm design and analysis, parallel processing, engineering education.

Gerald C. Gannod, Assistant Professor, Ph.D. Michigan State University, 1998, Software engineering, formal methods for software development, reverse engineering, reengineering, object-oriented analysis and design.

Sumit Ghosh, Associate Professor and Associate Chair for Research and Graduate Programs, Ph.D. Stanford University, 1984, Networking and distributed algorithms.

Forouzan Golshani, Professor, Ph.D. University of Warwick, England, 1982, Multimedia information system, digital video processing, advanced databases, intelligent systems.

Ben M. Huey, Associate Professor and Associate Dean for Planning and Administration, College of Engineering and Applied Science, Ph.D. University of Arizona, 1975, Language-based models for architecture, silicon compilation, design verification, automatic test generation.

Subbarao Kambhampati, Professor, Ph.D. University of Maryland, 1989, Artificial intelligence, automated planning, machine learning.

William E. Lewis, Professor and Vice Provost for Information Technology, Ph.D. Northwestern University, 1966, Analytical modelling, information systems.

Huan Liu, Associate Professor, Ph.D. University of Southern California, 1989, Data mining, Data warehousing, artificial intelligence, machine learning.

Donald S. Miller, Associate Professor, Ph.D. University of Southern California, 1972, Address space operating systems, distributed and multiprocessor operating systems, computer architecture, local area networks.

Faye Navabi, Lecturer, M.S. University of Southwestern Louisiana, 1991.

Gregory M. Nielson, Professor, Ph.D. University of Utah, 1970, Interactive design of curves and surfaces, multivariate data fitting, computer-aided geometric design, computer graphics, visualization of scientific computing.

Pearse O'Grady, Associate Professor and Associate Chair for Undergraduate Programs, Ph.D. University of Arizona, 1969, Parallel processing, computer architecture, continuous system simulation.

Sethuraman Panchanathan, Associate Professor, Ph.D. University of Ottawa, 1989, Multimedia computing and communications, multimedia hardware architectures, VLSI architectures for real-time video processing, indexing / storage / browsing / retrieval of image and video.

David C. Pheanis, Associate Professor, Ph.D. Arizona State University, 1974, Software/hardware interface in embedded microprocessor system, real-time system.

Andrea W. Richa, Assistant Professor, Ph.D. Carnegie Mellon University, 1998, Design and analysis of algorithms, algorithms for distributed networks, graph algorithms, approximation algorithms, combinatorial optimization, distributed network architectures, parallel computation.

Arunabha Sen, Associate Professor, Ph.D. University of South Carolina, 1987, Parallel computing, computer interconnection networks, combinatorial optimization.

Wei-Tek Tsai, Professor, Ph.D. University of California, Berkeley, 1986, Software engineering, internet, parallel and distributed processing.

Joseph E. Urban, Professor, Ph.D. University of Southwestern Louisiana, 1977, CASE, computer languages, data engineering, distributed computing, executable specification languages, software prototyping.

Susan D. Urban, Associate Professor, Ph.D. University of Southwestern Louisiana, 1987, Active database systems, heterogeneous database systems, object-oriented database systems.

Michael G. Wagner, Assistant Professor, Ph.D. Technical University of Vienna, 1994, Computer-aided geometric design, geometric modeling and processing, computer animation, theoretical kinematics, robotics and robot dynamics.

Richard Whitehouse, Lecturer, M.S. University of Tennessee, 1985.

Marvin C. Woodfill, Professor Emeritus, Ph.D. Iowa State University, 1964, Digital logic, microcomputers.

Goran Konjevod, Assistant Professor, Ph.D. Carnegie Mellon University, 2000, Design and analysis of algorithms, combinatorial optimization, graph theory, discrete mathematics.

Yann-Hang Lee, Professor, Ph.D. University of Michigan, Ann Arbor, 1985, Real-time systems, computer communication, computer architecture, fault-tolerant computing, distributed/parallel systems, and performance evaluation.

Sandeep K S Gupta, Associate Professor, Ph.D. Ohio State University, 1995, Mobile computing, compilers, parallel computing, parallel I/O.

Office.txt

Sheikh Iqbal Ahamed (GWC 328)

M 9:45 - 10:45 am

T 11:30 am - 12:30 pm

Th 4:45 - 5:45 pm

Donald Alpert (GWC 384)

Th 4 - 5 pm

Or by appointment

Chitta Baral (GWC 366)

TTh 4:40 - 5:40 pm

Rida Bazzi (GWC 336)

M 11:30 - 12:30 pm

W 3:05 - 4:05 pm

Th 10:30 - 11:30 am

Sourav Bhattacharya (GWC 221)

TTh 11 am - 1 pm

Selcuk Candan (GWC 370)

T 2:30 - 4 pm

W 3:30 - 4:30 pm

Yinong Chen (GWC 228)

TTh 10:30 am - 12:30 pm

James Collofello (GWC 310)

MF 10:40 - 11:30 am

W 12:40 - 1:30 pm

Joseph DeLibero (GWC 382)

M-F 7 - 7:30 am

MWF 9:45 - 10:15 am

TTh 9 - 9:30 am

Suzanne Dietrich (GWC 368)

T 3 - 4 pm

W 9:30 - 10:30 am

Th 3 - 4 pm

Leonard Faltz (GWC 314)

W 2 - 3:30 pm

Th 10 - 11:30 am

Or by appointment

Gerald Farin (GWC 340)

TTh 10:30 - 11:30

W 12 - 1 pm

Nicholas Findler (GWC 311)

T 2 - 3 pm

Barbara Gannod (GWC 378)

M 9:45 - 10:45 am, 2:45 - 4:30 pm

W 9:45 - 10:45 am, 2:45 - 3:45 pm

F 9:45 - 10:45 am

Gerald Gannod (GWC 324)

T 10 - 11:30 am

Th 3:15 - 4:30 pm

Or by appointment

Sumit Ghosh (GWC 206)

MTWF 1 - 2 pm

Forouzan Golshani (GWC 351)

MW 2:30 - 3:15 pm

T 11 am - 12 pm

Sandeep Gupta (GWC 224A)

TTh 10 - 11:30 am

Ben Huey (GWC 374)

TTh 2:30 - 3:30 pm

Or by appointment

Lance Johnson (GWC 354)

MW 6:30 - 7:30 pm

Subbarao Kambhampati (GWC 374)

MW 3 - 4 pm

Goran Konjevod (GWC 312)

TTh 4:30 - 6 pm

Yann-Hang Lee (GWC 224B)

MW 4 - 5:30 pm

Huan Liu (GWC 342)

MWF 8:50 - 10:10 am

Or by appointment

Stephanie Ludi (GWC 376)

MW 10:40 am - 12:20 pm

Donald Miller (GWC 348)

W 10 - 11 am

Th 12:10 - 1:10 pm

TTh 8 - 8:30 am (EC G236)

Mutsumi Nakamura (GWC 376)

MW 10:30 - 11:15 am

F 10:30 am - 12 pm

Faye Navabi (GWC 380)

MWF 1 - 2:15 pm

T 10:30 am - 12 pm

Gregory Nielson (GWC 338)

T 6:30 - 7:30 pm

Th 4:15 - 5:15 pm

Or by appointment

Pearse O'Grady (GWC 206)

W 2:30 - 3:30 pm

David Pheanis (GWC 318)

MW 8 - 9 am, 2 - 3 pm, 5 - 6:30 pm

Th 6:30 - 7:30 pm

Perry Reinert (GWC 322)

TTh 6 - 7 pm

Andrea Richa (GWC 344)

T 2 - 3:30 pm

W 10:30 - 11:30 am

Chuck Riden (GWC 382)

MTW 2 - 3 pm

Or by appointment

Mouli Subramanian (GWC 228)

TTh 6 - 7:30 pm

Wei-Tek Tsai (GWC 356)

MW 3 - 4 pm

Renee Turban (GWC 328)

MW 5:50 - 6:50 pm

Joseph Urban (GWC 358)

MW 3 - 4 pm

T 11 am - 12 pm

Susan Urban (GWC 372)

T 9:15 - 10:15 am

W 1 - 2 pm

Th 9:15 - 10:15 am

Richard Whitehouse (GWC 322)

MF 1 - 2 pm

TTh 9 - 10 am

Stephen Yau (GWC 206)

TTh 4:30 - 6 pm

OFFICE HOURS FOR SPRING 2001

Phone.txt

Faculty Office Phone E-Mail

Ashcroft, Edward GWC 320 5-7544

Baral, Chitta GWC 366 7-6047

Bazzi, Rida GWC 336 5-2796

Bhattacharya, Sourav GWC 334 5-5190

Candan, Kasim GWC 370 5-2770

Collofello, Jim GWC 310 5-3733

Dasgupta, Partha GWC 326 5-5583

DeLibero, Joseph GWC 382 5-1493

Dietrich, Suzanne GWC 368 5-2786

Faltz, Leonard GWC 314 5-1581

Farin, Gerald GWC 340 5-5142

Findler, Nicholas GWC 311 5-5934

Gannod, Barbara GWC 378 5-1757

Gannod, Gerald GWC 324 7-4475

Ghosh, Sumit GWC 206 5-3190

Golshani, Forouzan GWC 351 5-2855

Huey, Ben ECG 107 7-6476

Kambhampati, Rao GWC 374 5-0113

Konjevod, Goran GWC 312 5-2783

Lee, Yann-Hang GWC 224B 7-7507

Lewis, William CPCOM 462 5-0699

Liu, Huan GWC 342 7-7349

Miller, Donald GWC 348 5-5935

Navabi, Faye GWC 380 5-3228

Nielson, Gregory GWC 338 5-2785

O'Grady, Pearse GWC 206 5-3190

Panchanathan, Sethuraman GWC 362 5-3699

Pheanis, David GWC 318 5-7389

Richa, Andrea GWC 344 5-7555

Robbins, Earl GWC 354 7-6910

Sen, Arunabha GWC 347 5-6153

Tsai, Wei-Tek GWC 356 7-6921

Urban, Joseph GWC 358 5-3374

Urban, Susan GWC 372 5-2784

Wagner, Michael GWC 346 5-1735

Whitehouse, Richard GWC 322 5-3983

Woodfill, Marvin GWC 316 5-3689

Yau, Stephen GWC 206C 5-3190

Course.txt

100 Introduction to Computer Science I. (3)

Concepts of problem solving, algorithm design, structured programming, fundamental algorithms and techniques, and computer systems concepts. Prerequisite: MAT 170. Instructor: J. DELIBERO.

110 Principles of Programming with Java. (3)

Concepts of problem solving using Java, algorithm design, structured programming, fundamental algorithms and techniques, and computer systems concepts. Prerequisite: MAT 170. Instructor: N. TADAYON.

120 Digital Design Fundamentals. (3)

Number systems, conversion methods, binary and complement arithmetic, boolean and switching algebra, circuit minimization. ROMs, PLAs, flipflops, synchronous sequential circuits, and register transfer design. Lecture, lab. Prerequisite: Computer Literacy. Instructor: W. HIGGINS.

180 Computer Literacy. (3)

Introduction to general problem-solving approaches using widely available software tools such as database packages, word processors, spreadsheets, and report generators. Instructor: C. RIDEN.

181 Applied Problem Solving with Visual BASIC. (3)

Introduction to systematic definition of problems, solution formulation, and method validation. Computer solution using Visual BASIC required for projects. Prerequisite: MAT 117. Instructor: N. TADAYON.

185 Internet and the World Wide Web. (3)

Fundamental Internet concepts. World Wide Web browsing, searching, publishing, advanced Internet productivity tools.

200 Concepts of Computer Science. (3)

Overview of algorithms, architecture, languages, computer systems, theory. Problem solving by programming with a high-level language (Java or another). Prerequisites: CSE 100 or CSE 110. Instructor: H. LIU.

210 Object-Oriented Design and Data Structures. (3)

Object Oriented Design, Static and Dynamic Data Structures (Strings, Stacks, Queues, Binary Trees), Recursion, Searching and Sorting, Professional Responsibility. Prerequisite: CSE 200. Instructor: S. AHAMED.

225 Assembly Language Programming (Motorola). (4)

Assembly language programming, register level computer organization, data structure and addressing modes, assemblers, and linkers. Prerequisite: CSE 100 or CSE 200. Instructor: L. JOHNSON.

226 Assembly Language Programming (Intel). (4)

Assembly language programming, register level computer organization, data structure and addressing modes, assemblers, and linkers. Prerequisite: CSE 100 or CSE 200. Instructor: W. HIGGINS.

240 Introduction to Programming Languages. (3)

Introduction to the procedural (Ada), applicative (LISP) and declarative (Prolog) languages. Prerequisites: CSE 210. Instructor: R. WHITEHOUSE.

310 Data Structures and Algorithms. (3)

Advanced data structures and algorithms, including stacks, queues, trees (B, B+, AVL), and graphs, Searching for graphs, hashing and external sorting. Prerequisite: CSE 210 or MAT 243. Instructor: B. GANNOD.

330 Computer Organization and Architecture. (3)

Instruction set architecture, processor performance and design, datapath, control (hardwired, microprogrammed), pipelining, input/output, Memory organization with cache, virtual memory. Prerequisite: CSE 225 or CSE 226. Instructor: E. O'GRADY.

340 Principles of Programming Languages. (3)

Introduction to language design and implementation, Parallel, machine dependent and declarative language features, type theory, specification, recognition, translation, run-time management. Prerequisites: CSE 240, CSE 310, CSE 225 or CSE 226. Instructor: L. FALTZ.

355 Introduction to Theoretical Computer Science. (3)

Introduction to formal language theory and automata, Turing machines, decidability/undecidability, recursive function theory, and introduction to complexity theory. Prerequisite: CSE 310. Instructor: A. SEN.

360 Introduction to Software Engineering. (3)

Software life cycle models, Project management, team development, environments and methodologies, software architectures, quality assurance and standards, legal, ethical issues. Prerequisite: CSE 240 and CSE 210. Instructor: J. COLLOFELLO.

408 Multimedia Information Systems. (3)

Design, use, and applications of multimedia systems, An introduction to acquisition, compression, storage, retrieval, and presentation of data from different media such as images, text, voice, and alphanumeric. Prerequisite: CSE 310. Instructor: F. GOLSHANI.

412 Database Management. (3)

Introduction to DBMS concepts, Data models and languages, Relational database theory, Database security/ integrity and concurrency. Prerequisite: CSE 310. Instructor: K. CANDAN.

420 Computer Architecture I. (3)

Computer architecture, Performance versus cost trade-offs, Instruction set design, Basic processor implementation and pipelining. Prerequisite: CSE 330.

421 Microprocessor System Design I. (4)

Assembly-language programming and logical hardware design of systems using 8-bit microprocessors and micro-controllers, Fundamental concepts of digital system design, Reliability and social, legal implications. Prerequisite: CSE 225. Instructor: D. PHEANIS.

422 Microprocessor System Design II. (4)

Design of microcomputer systems using contemporary logic and microcomputer system components, Requires assembly language programming. Prerequisite: CSE 421.

423 Microcomputer System Hardware. (3)

Information and techniques presented in CSE 422 are used to develop the hardware design of a microprocessor, multiprogramming, microprocessor-based system. Prerequisite: CSE 422.

428 Computer-Aided Processes. (3)

Hardware and software considerations for computerized manufacturing systems, Specific concentration on automatic inspection, numerical control, robotics, and integrated manufacturing systems. Prerequisite: CSE 330.

430 Operating Systems. (3)

Operating system structure and services, processor scheduling, concurrent processes, synchronization techniques, memory management, virtual memory, input/output, storage management, file systems. Prerequisites: CSE 330 and CSE 340. Instructor: D. MILLER.

434 Computer Networks. (3)

Physical layer basics, network protocol algorithms, error handling, flow control, multihop routing, network reliability, timing, security, data compression, cryptography fundamentals. Prerequisite: CSE 330. Instructor: S. GUPTA.

438 Systems Programming. (3)

Design and implementation of systems programs, including text editors, file utilities, monitors, assemblers, relocating linking loaders, I/O handlers, schedulers, etc. Prerequisite: CSE 421. Instructor: D. PHEANIS.

440 Compiler Construction I. (3)

Introduction to programming language implementation, Implementation strategies such as compilation, interpretation, and translation, Major compilation phases such as lexical analysis, semantic analysis, optimization, and code generation. Prerequisites: CSE 340 and CSE 355. Instructor: R. BAZZI.

445/598 Distributed Computing with Java and CORBA. (3)

Frameworks for distributed software components, Foundations of client-server computing and architectures for distributed object systems, Dynamic discovery and invocation. Prerequisites: CSE 360. Instructor: W. TSAI.

446/598 Client-Server User Interfaces. (3)

S, Client-server model for creating window interfaces, Toolkits and libraries such as X11, Microsoft Foundation Classes and Java Abstract Window Toolkit. Prerequisites: CSE 310. Instructor: P. REINERT.

450 Design and Analysis of Algorithms. (3)

Design and analysis of computer algorithms using analytical and empirical methods, complexity measures, design methodologies, and survey of important algorithms. Prerequisite: CSE 310. Instructor: G. KONJEVOD.

457 Theory of Formal Languages. (3)

Theory of grammar, methods of syntactic analysis and specification, types of artificial languages, relationship between formal languages, and automata. Prerequisite: CSE 355.

459 Logic for Computing Scientist I. (3)

Propositional logic, syntax and semantics, proof theory vs. model theory, soundness, consistency and completeness, first order logic, logical theories, automated theorem proving, ground resolution, pattern matching unification and resolution, Dijkstra's logic, proof obligations, and program proving. Prerequisite: CSE 355.

460 Software Engineering. (3)

Software engineering foundations, formal representations in the software process, use of formalisms in creating a measured and structured working environment. Prerequisite: CSE 360. Instructor: G. GANNOD.

461 Software Engineering Project 1. (3)

First of 2-course software design sequence, Development planning, management, process modeling, incremental and team development using CASE tools. Prerequisite: CSE 360. Instructor: S. YAU.

462 Software Engineering Project 2. (3)

Second of 2-course software design sequence, Process, product assessment and improvement, incremental and team development using CASE tools. Prerequisite: CSE 461. Instructor: J. URBAN.

470 Computer Graphics. (3)

Display devices, data structures, transformation, interactive graphics, 3-dimensional graphics, and hidden line problem. Prerequisites: CSE 310; MAT 342. Instructor: G. NIELSON.

471 Introduction to Artificial Intelligence. (3)