Wise Words on Data Modeling
Graeme Simsion, Simsion & Associates / University of Melbourne[1]
Introduction
Let me start with a minor confession: the thing I enjoyed most about writing the latest edition of Data Modeling Essentials was choosing the quotations for the beginning of each chapter. There’s something curiously satisfying about finding that some towering historical figure or legendary intellect has, in a few words, captured the essence of a key data modeling concept – especially when they obviously weren’t talking about data modeling at the time. And of course, it’s an opportunity to show that one’s reading isn’t entirely confined to database manuals. My co-author, Graham Witt, firmly focused on the serious stuff, was happy enough to leave it largely to me.
Here, for your amusement, and hopefully as an easily-remembered set of data modeling ideas, are the quotes[2] which open each of the seventeen chapters. I’ve added a few words of elaboration to each – enough, I hope, to allow me to legitimately claim authorship of this article, and give you something new if you already own the book.
Chapter 1: What is Data Modeling?
Ask not what you do, but what you do it to. - Bertrand Meyer
Bertrand Meyer is one of the pioneers of the object-oriented movement, and the inventor of the Eiffel language. About once a week I get an email from someone saying “our managers don’t believe we need data modeling”, and often the reason given is that “we’re using OO techniques, and the OO people don’t want you guys involved.”
So it was nice to be able to quote someone from the OO side of the fence as recognizing the importance of data organization. I should add that Meyer is not alone: the OO people who really know what they’re doing understand the importance of data structure as the foundation of data-centric applications, regardless of whether they are developed using traditional or OO techniques and tools. Someone has to specify that data structure – and that is what data modeling is about. The job doesn’t go away, even if the people who do it are not called “data modelers”.
Chapter 2: Basics of Good Structure
Begin with the end in mind
- Stephen Covey, The 7 Habits of Highly Effective People
Introducing the relational model – and particularly normalization – so early in the book has always been a source of controversy, with some reviewers believing that we should take a top-down approach. Years of teaching in industry and academe have convinced Graham and me that people learn better if they understand what they’re working toward: what a good deliverable looks like. This doesn’t just apply to students; some practitioners recruited from non-IT backgrounds and content to work at the conceptual end of the process would improve their credibility and effectiveness if they had a better knowledge of logical database design.
Chapter 3: The Entity-Relationship Approach
It is above all else the separation of designing from making and the increased importance of the drawing which characterises the modern design process
- Bryan Lawson, How Designers Think
I have been outspoken about my contention that data modeling is better characterized as design than analysis, and that architecture is an excellent metaphor for data modeling. The Data Administration Newsletter includes a recent summary of my position[3]. If you agree with me, you are likely to find Bryan Lawson’s classic text on design, which draws heavily on his background in architecture, fascinating reading.
Chapter 4: Subtypes and Supertypes
There is no abstract art. You must always start with something. Afterward, you can remove all traces of reality. - Pablo Picasso
Experienced data modelers often propose entities and attributes at higher levels of generalization (abstraction) than other stakeholders are comfortable with, prompting criticism of their models as esoteric and impractical. The challenge for professional modelers is to be able to work at different levels of generalization, to understand the tradeoffs – particularly between rule enforcement and stability – and to be able to explain what they mean.
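The tradeoff between rule enforcement and stability can be made concrete with a minimal sketch. The table and column names below (employee, party_role, role_type) are hypothetical and do not come from the book. The example uses Python's built-in sqlite3 module only as a convenient way to run the SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Specific model (hypothetical): the rule "every employee has a salary"
# is enforced by the structure itself.
conn.execute("""
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        salary      REAL NOT NULL
    )
""")

# Generalized model (hypothetical): one table covers any kind of worker,
# so salary has to be optional and the rule is no longer enforced.
conn.execute("""
    CREATE TABLE party_role (
        party_role_id INTEGER PRIMARY KEY,
        role_type     TEXT NOT NULL,
        name          TEXT NOT NULL,
        salary        REAL
    )
""")

# The specific model rejects an employee without a salary.
try:
    conn.execute("INSERT INTO employee (name) VALUES ('Ann')")
    specific_rejected = False
except sqlite3.IntegrityError:
    specific_rejected = True

# The generalized model accepts the same row, and also accepts a new
# kind of worker without any change to the table definitions.
conn.execute("INSERT INTO party_role (role_type, name) VALUES ('Employee', 'Ann')")
conn.execute("INSERT INTO party_role (role_type, name) VALUES ('Contractor', 'Bob')")
general_rows = conn.execute("SELECT COUNT(*) FROM party_role").fetchone()[0]

print(specific_rejected, general_rows)
```

In this sketch the specific table enforces the salary rule but would need a new table to handle contractors. The generalized table absorbs the new role without change but leaves the salary rule to application code.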
Chapter 5: Attributes and Columns
Sometimes the detail wags the dog - Robert Venturi (Architect)
If you’ve spent some time around databases, from those supporting major applications through to local spreadsheets, you’ll know that column (or data item) definition is frequently done poorly – and with significant impact. Data modelers don’t always offer as much guidance as they should here. Yet, with the increasing use of packaged software, and approaches to integration based on messages rather than shared databases, this is perhaps the area in which data modelers’ contributions will be most needed.
Chapter 6: Primary Keys
The only thing we knew for sure about Henry Porter was that his name wasn’t Henry Porter. - Bob Dylan & Sam Shepard, Brownsville Girl[4]
Continuing the subject of poor database design, in my experience, the single most common serious design error is poor choice of primary keys. And surrogate keys do not provide a “no brainer” solution.
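One way a surrogate key can fall short of a “no brainer” is sketched below, assuming a hypothetical customer table in which tax_number is the real-world identifier. The names and values are illustrative only, and Python's built-in sqlite3 module is used simply to run the SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Surrogate key only: nothing stops the same real-world customer
# from being recorded twice under two different customer_id values.
conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,  -- surrogate key assigned by the DBMS
        tax_number  TEXT NOT NULL,        -- real-world identifier, not constrained
        name        TEXT NOT NULL
    )
""")
conn.execute("INSERT INTO customer (tax_number, name) VALUES ('123-45', 'Henry Porter')")
conn.execute("INSERT INTO customer (tax_number, name) VALUES ('123-45', 'Henry Porter')")
duplicates = conn.execute(
    "SELECT COUNT(*) FROM customer WHERE tax_number = '123-45'"
).fetchone()[0]

# Same surrogate key, but the real-world identifier is also declared unique.
conn.execute("""
    CREATE TABLE customer_checked (
        customer_id INTEGER PRIMARY KEY,
        tax_number  TEXT NOT NULL UNIQUE,
        name        TEXT NOT NULL
    )
""")
conn.execute(
    "INSERT INTO customer_checked (tax_number, name) VALUES ('123-45', 'Henry Porter')"
)
try:
    conn.execute(
        "INSERT INTO customer_checked (tax_number, name) VALUES ('123-45', 'Henry Porter')"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(duplicates, rejected)
```

The first table ends up with two rows for the same tax number, while the second rejects the repeat. The sketch only illustrates that adding a surrogate key does not remove the need to decide what identifies a row in the real world.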
Chapter 7: Extensions and Alternatives
The limits of my language mean the limits of my world
- Ludwig Wittgenstein, Tractatus Logico-Philosophicus
One of the frustrations for data modelers who would like to see our body of knowledge grow is the attention that academic and practitioner research has given to devising and comparing modeling languages and conventions, rather than to using them well. However, knowledge of the constructs supported by alternative languages (UML, Chen E-R, ORM, etc.) can give us some valuable additional tools for describing and analyzing problems and data structures.
Chapter 8: Organizing the Data Modeling Task
The fact was I had the vision… I think everyone has… what we lack is the method.
- Jack Kerouac
The fact is – data modelers don’t agree on the stages of data modeling and what fits within each stage. This is not disastrous, but we do need to be able within our organization to provide project planners and managers with a clear picture of where data modeling fits, and what the stages and deliverables will be. A good place to start is with the question “where will the specification for the database come from?”
Chapter 9: The Business Requirements
The real voyage of discovery consists not in seeking new landscapes but in having new eyes - Marcel Proust
Some modelers recognize a formal phase to identify business requirements – others aim to pick them up in the conceptual modeling phase. There’s more to understanding business requirements than asking “what data do you use?” If there’s a single piece of advice I’d give about this phase (not just to data modelers but to all business analysts), it’s “don’t rely on interviews with centralized management: get out and see the real users.”
Chapter 10: Conceptual Data Modeling
If you want to make an apple pie from scratch, you must first create the universe
- Carl Sagan
Designers in other disciplines work with patterns, and I observed data modelers doing the same thing many years ago. Patterns are our most important resource in conceptual modeling: we need to remember what we do, look at others’ models with a view to learning – and own the books of patterns compiled by David Hay and Len Silverston.
Chapter 11: Logical Database Design
Utopia today, flesh and blood tomorrow - Victor Hugo, Les Misérables
One of the toughest questions for the data modeler as they move from requirements to specification is: “to what extent should I take into account the features of the target DBMS?” Today not all relational DBMSs support the same set of logical structures – and if we fail to take extended features into account, then we are not playing with a full deck.
Chapter 12: Physical Database Design
“Necessity is the mother of invention” is a silly proverb. “Necessity is the mother of futile dodges” is much nearer the truth. - Alfred North Whitehead
The boundary between logical and physical database design is crucial – because it usually entails a handover from data modeler to DBA. I believe that the data modeler should own the conceptual schema – the base tables as implemented – and no change should be made without his or her involvement and approval.
Chapter 13: Advanced Normalization
Everything should be made as simple as possible, but not simpler
- Albert Einstein (attrib)
There are far too many misconceptions about the normal forms beyond 3rd Normal Form – and data modelers often don’t know enough to deal with them. The issues are real and important, but most of them can be understood by learning a few patterns.
Chapter 14: Modeling Business Rules
He may justly be numbered amongst the benefactors of mankind, who contracts the great rules of life into short sentences. - Samuel Johnson
On reflection, I would have preferred to use a quote from another Englishman, Lord Denning, who declared that he preferred setting legal precedents to following them. In dealing with business rules, it’s easy to forget that most of them are human constructions, often reflecting assumptions about what is possible. Knowing what is technically possible, we should be proactive in questioning them and proposing alternatives.
Chapter 15: Time-Dependent Data
History smiles at all attempts to force its flow into theoretical patterns or logical grooves; it plays havoc with our generalizations, breaks all our rules; history is baroque.
- Will Durant, The Lessons of History
Data modelers dealing with time-dependent data cannot apply a single general rule: once again, the approach is to learn a set of patterns and to choose the best one according to the circumstances.
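As an illustration of one such pattern (not a general rule), the sketch below keeps a history of values with effective date ranges and looks up the value in force on a given date. The table product_price, its columns and the helper price_as_of are hypothetical names chosen for this example, and Python's built-in sqlite3 module is used only to run the SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical history table: each row holds a price for the half-open
# period [valid_from, valid_to). A NULL valid_to marks the current row.
conn.execute("""
    CREATE TABLE product_price (
        product_id INTEGER NOT NULL,
        valid_from TEXT NOT NULL,  -- ISO dates sort correctly as text
        valid_to   TEXT,
        price      REAL NOT NULL,
        PRIMARY KEY (product_id, valid_from)
    )
""")
conn.executemany(
    "INSERT INTO product_price VALUES (?, ?, ?, ?)",
    [
        (1, "2004-01-01", "2004-07-01", 10.0),
        (1, "2004-07-01", None, 12.5),
    ],
)


def price_as_of(product_id, date):
    """Return the price in force on the given ISO date, or None if there is none."""
    row = conn.execute(
        """
        SELECT price FROM product_price
        WHERE product_id = ?
          AND valid_from <= ?
          AND (valid_to IS NULL OR ? < valid_to)
        """,
        (product_id, date, date),
    ).fetchone()
    return row[0] if row else None


print(price_as_of(1, "2004-03-15"), price_as_of(1, "2004-12-01"))
```

Other patterns, such as keeping only a current value plus an audit log, suit other circumstances. This one is shown only because it is compact.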
Chapter 16: Modeling for Data Warehouses and Data Marts
The more constraints one imposes, the more one frees oneself of the chains that shackle the spirit. - Igor Stravinsky, Poetics of Music
The first edition of Data Modeling Essentials (in 1994) did not even mention data warehouses and marts. Now they warrant books of their own, and many data modelers have never worked on “conventional” transaction processing databases. For traditional modelers, though, they pose a challenge – and many see the techniques used (particularly star schemas) as unacceptable restrictions on their ability to model the business world. I would argue that this is more a matter of mind-set than anything else, and that a good traditional modeler can readily transfer skills to the warehouse environment.
Chapter 17: Enterprise Data Models
Always design a thing by considering it in its next larger context – a chair in a room, a room in a house, a house in an environment, an environment in a city plan
- Eliel Saarinen
I am a cynic when it comes to enterprise modeling – but I took the easy way out with a quote that reinforces the traditional view. Graham Witt, noting my Bob Dylan quote, demanded equal time for Led Zeppelin; “The Song Remains the Same”[5] would have met his request and conveyed my views better. We can’t talk about enterprise data modeling without talking about enterprise data management – and this is the hard part!
A Final Quote
Finally, here’s one we didn’t use – a few words on our profession from the most quotable source of all:
And as imagination bodies forth
The forms of things unknown, the poet’s pen
Turns them to shapes and gives to airy nothing
A local habitation and a name.
- William Shakespeare, A Midsummer Night’s Dream
The third edition of Graeme’s book Data Modeling Essentials (written with Graham Witt) was released by Morgan Kaufmann in November 2004.
[1]
[2] In some cases in which there were two quotations, I have included only one here.
[3] “You’re Making it Up: Data Modeling – Analysis or Design?”, TDAN, 1st Quarter, 2005.
[4] 1986, Special Rider Music
[5] From the song and album of the same name.