System Quality
Fall 2002
Professor Arthur Goldberg
Schedule (version 9.11.02)
Week / Date / Topic / Description / Readings / Assignments
1 / 9/4 / System quality introduction; Software Development Best Practices, Capers Jones, Part I / Problems caused by lousy computer systems. What is system quality? (ISO 9126 handout). Software and data quality.
Syllabus overview. Readings. Assignments.
Software quality: Software development approaches. Best practices. Peer reviews and software testing.
Data quality: data ownership, data analysis tools and techniques, metadata, data quality rules and data cleansing. / Buggy software still takes a toll,
Business software firms sued over implementation,
Gridlock as 800 traffic lights seize,
Fishermen rescued after dam malfunction,
Software problem kills soldiers in training incident
2 / 9/11 / Software Development Best Practices, Part II / Software development best practices; software development quality; benchmark measures of productivity and defect rates; benchmark studies; cost measures (money, labor, quality); best and worst practices / Capers Jones, Sizing Up Software, Scientific American, 1998, 279(6): 104-111.
3 / 9/18 / Software Development Best Practices, Part III / Capers Jones, Conflict And Litigation Between Software Clients And Developers
4 / 9/25 / Peer Reviews of Software and Other Objects / Perform an inspection exercise in class. / Tom Gilb and Dorothy Graham, Software Inspection, Chapter 3, Overview of Software Inspection / Out: software inspection – what to inspect? Create teams?
5 / 10/2 / Data Quality Introduction / Data Ownership and Data Roles, Cost Analysis of Poor Data Quality, Dimensions of Data Quality, Data Models, Data Values, Data Analysis Techniques and Tools, Data Quality Improvement, Metadata and Enterprise Reference Data, Domains and Mappings,
Data Quality Rules: Definition and Discovery, Data Profiling, Data Transformation and Cleansing: Standardization, Linkage,
Duplicate Elimination, and Approximate Searching. / Loshin, parts TBD
6 / 10/9 / Data Quality in Databases, Cost of Low Data Quality & Dimensions of Data Quality / Data models and databases. Data flow. Costs of data defects. The information chain. Domain constraints. Integrity constraints.
Quality of Data Models; Quality of Data Values; Data profiling
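As a rough illustration of data profiling against a domain constraint, the sketch below summarizes one column of values. The function name `profile_column`, the sample `states` column, and the allowed domain are all hypothetical and chosen only for illustration; real profiling tools report far more.

```python
from collections import Counter

def profile_column(values, domain=None):
    """Summarize one column: row count, nulls, distinct values, domain violations."""
    # Treat None and the empty string as missing values (an assumption).
    non_null = [v for v in values if v not in (None, "")]
    counts = Counter(non_null)
    # A domain constraint lists the only values the column may legally hold.
    violations = [v for v in non_null if domain is not None and v not in domain]
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0] if counts else None,
        "domain_violations": violations,
    }

# Hypothetical sample column of state codes.
states = ["NY", "NJ", "NY", "", "N.Y.", None, "CT"]
report = profile_column(states, domain={"NY", "NJ", "CT"})
print(report)
```

Here the value "N.Y." is flagged because it falls outside the declared domain, which is the kind of defect a later cleansing step would standardize.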
7 / 10/16 / Software testing / TBD / TBD / In: Inspection reports; Out: Data profiling
8 / 10/23 / Bob Fitterman: Extreme programming: Vindigo’s Experience / Guest lecture. Fitterman is Chief Technology Officer of Vindigo, the leading supplier of localized information to handhelds and cell phones. He’s responsible for supervising the design of Vindigo’s data management and synchronization systems.
As Bob will discuss, Vindigo develops under the Extreme programming philosophy. In essence, Extreme programming dramatically compresses the design-build-test-deploy cycle. It prescribes: first plan the tests, program in pairs, and release constantly. presents the details. / and all pages one link from it, including continued pages.
9 / 10/30 / Data Standardization / Types of data errors: transcription, typing, auditory, etc.
10 / 11/6 / Ken Estes: Software Development Disasters and Weinberg’s Views on Software Development / Guest lecture. Estes is an accomplished software engineer with extensive experience in the engineering and debugging of in-house and third-party applications. He is widely trained and read in software engineering theory and practice, with real-world experience in their application. He’s the author of the current version of Tinderbox, the automated build/test monitoring software used by Netscape. Estes is the designer and author of the run-time and build-time dependency tracking tools in the Red Hat 7.0 package management system.
Estes will recount software development and deployment experiences he has had while working at high tech companies and financial institutions.
In addition, he will present Gerald Weinberg’s software management philosophy. Weinberg is a leading thinker on the psychology of software development, and the author of several dozen books in the area, including the four-volume Quality Software Management series. Weinberg incorporates the precepts of family psychology, especially the work of Virginia Satir, into software project organization and management. / Ken to recommend links
11 / 11/13 / Project Effort Estimation / Delphi estimation; Construx estimation technique; Perform a Wideband Delphi estimation exercise in class. / Karl E. Wiegers, Stop Promising Miracles
Generic Delphi Estimation Process
/ In: Data profiling
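To make the Delphi exercise concrete, here is a minimal sketch of how the estimates from successive rounds might be summarized. The convergence rule (spread relative to the median, with a tolerance of 0.25), the function names, and the sample numbers are assumptions for illustration, not part of any prescribed Delphi procedure.

```python
from statistics import median

def summarize_round(estimates):
    """Return low, median, high, and relative spread for one round of estimates."""
    mid = median(estimates)
    spread = (max(estimates) - min(estimates)) / mid
    return {"low": min(estimates), "median": mid, "high": max(estimates), "spread": spread}

def has_converged(estimates, tolerance=0.25):
    """Hypothetical stopping rule: the range is within `tolerance` of the median."""
    return summarize_round(estimates)["spread"] <= tolerance

# Hypothetical effort estimates (person-days) from four estimators over three rounds.
rounds = [[10, 25, 40, 18], [16, 24, 30, 20], [20, 22, 24, 21]]

# Index of the first round whose estimates satisfy the stopping rule, if any.
first_converged = next((i for i, r in enumerate(rounds) if has_converged(r)), None)
for number, estimates in enumerate(rounds, start=1):
    print(number, summarize_round(estimates))
```

In a live session the discussion between rounds, not the arithmetic, is what narrows the range; the code only records the outcome.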
12 / 11/20 / Dr. Ram Chillarege: Orthogonal Defect Classification / Guest lecture. Chillarege was a computer scientist at IBM Research and CTO of Opus360. He invented orthogonal defect classification (ODC) at IBM in the early 90s. ODC is based on the observation that defects can be classified by type, such as design, I/O, formatting, initialization, timing, etc. Defect classification measurements taken during software development can accurately indicate the project's actual development stage (design, coding, unit test, etc.), which can be contrasted with the purported stage. / Ram to recommend a paper
13 / 11/27 / Data Matching: Traditional and Machine Learning Approaches / Data Matching (or Linkage): Linkage matches multiple records that correspond to the same real entity, such as {Arthur P. Goldberg, 333 3rd Ave., # 12S, 10010, 212 685-1234} with {Art Golberg, 333 Third Avenue, Apt 12 South, 10001, 212.686.1234}. We present linkage techniques, focusing on the statistical Maximum Entropy machine learning method commercialized by ChoiceMaker Technologies.
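The two example records above can be compared with a very simple, hand-built similarity score. This sketch is not the Maximum Entropy method named in the description; it swaps in plain string similarity from Python's standard `difflib` module, averaged over fields with equal weight. The field names, the abbreviation table, and the unrelated record `c` are assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table used to standardize street text.
ABBREVIATIONS = {"3rd": "third", "ave": "avenue"}
FIELDS = ["name", "street", "unit", "zip", "phone"]

def normalize(text):
    """Lowercase, drop punctuation, and expand known abbreviations."""
    tokens = re.sub(r"[^a-z0-9]+", " ", text.lower()).split()
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokens)

def digits_only(text):
    """Keep only digits, so '212 685-1234' and '212.685.1234' compare equal."""
    return re.sub(r"\D", "", text)

def match_score(r1, r2):
    """Average per-field string similarity, each in the range 0.0 to 1.0."""
    sims = []
    for field in FIELDS:
        clean = digits_only if field == "phone" else normalize
        sims.append(SequenceMatcher(None, clean(r1[field]), clean(r2[field])).ratio())
    return sum(sims) / len(sims)

a = {"name": "Arthur P. Goldberg", "street": "333 3rd Ave.", "unit": "# 12S",
     "zip": "10010", "phone": "212 685-1234"}
b = {"name": "Art Golberg", "street": "333 Third Avenue", "unit": "Apt 12 South",
     "zip": "10001", "phone": "212.686.1234"}
# Hypothetical unrelated record, for contrast.
c = {"name": "Maria Sanchez", "street": "18 Elm Street", "unit": "",
     "zip": "94110", "phone": "415 555-0100"}

print(match_score(a, b), match_score(a, c))
```

A learned model would replace the equal weights with weights estimated from labeled match and non-match pairs; the standardization step would stay much the same.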
14 / 12/4 / Ilya Pevzner guest lecture. Data Merging: A Machine Learning Approach / Merging combines multiple matching records into a single record and resolves conflicting values, e.g. by deciding that the records above should be merged to {Arthur P. Goldberg, 333 Third Ave., Apt. 12S, 10010, ?}. Ilya will present his PhD research. / Out: take home final, due in 1 week
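A rule-based sketch of merging, using three fields from the week 13 example records, may help fix the idea. It uses hand-written per-field rules rather than machine learning, and the rule functions (`longest`, `prefer_first`, `agree_or_unknown`) are hypothetical. Fields whose conflict no rule resolves are left as `None`, mirroring the "?" in the merged record above.

```python
def longest(values):
    """Prefer the most complete (longest) value, e.g. a full name over a nickname."""
    return max(values, key=len)

def prefer_first(values):
    """Trust the first record, assuming (hypothetically) it is the more reliable source."""
    return values[0]

def agree_or_unknown(values):
    """Keep a value only if every record agrees; otherwise leave it unresolved."""
    return values[0] if len(set(values)) == 1 else None

def merge(records, rules, default=agree_or_unknown):
    """Merge matching records field by field using the given resolution rules."""
    merged = {}
    for field in records[0]:
        values = [r[field] for r in records]
        merged[field] = rules.get(field, default)(values)
    return merged

a = {"name": "Arthur P. Goldberg", "zip": "10010", "phone": "212 685-1234"}
b = {"name": "Art Golberg", "zip": "10001", "phone": "212.686.1234"}

merged = merge([a, b], {"name": longest, "zip": prefer_first})
print(merged)
```

The phone numbers disagree and no rule covers them, so the merged phone stays unresolved instead of being guessed.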