Data Modelling, Data Visualisation and the Third Dimension
V 5.0
Dermot Duncan
Department of Information Technology
National University Of Ireland, Galway
September, 2010
Abstract
The application of both 3D modelling and data visualisation techniques to current data modelling methodologies will both simplify the design of complex databases and improve the usability and navigability of complex data models. By using advances in 3D modelling and the representation of 3D objects, the clarity and ease of use of a model can be dramatically improved. As a 3D model is not restricted to two dimensions, a much larger set of data can be displayed clearly in a much smaller space. Using modern data visualization techniques, data models can be made much more interactive and navigable, making them more accessible and usable to a broader spectrum of users. This research proposes a new 3D data modelling approach which incorporates some data visualisation techniques, and develops an interactive Flash tool to implement this approach. Experiments are conducted on this tool by a varied set of users to provide a broader spectrum of results.
Acknowledgements
I would first like to acknowledge the support of my supervisor, Colm O’Riordan. He was extremely generous with both his time and patience and offered continued guidance and encouragement throughout my research.
I’d also like to thank all my testers who put up with the initial hiccups on the testing site and generously donated both their time and energy to help me complete my experiments.
Lastly, I’d like to thank everyone who helped proofread this document for all their advice and continued patience throughout each version of the thesis.
Contents
Abstract
Acknowledgements
1.Introduction
1.1.Research Overview
1.2.Scope
1.3.Significance of the study
1.4.Research Questions & Hypothesis
1.5.Methodology
1.6.Case Study
1.7.Success Criteria
1.8.Thesis Layout
2.Literature Review
2.1.Introduction
2.2.Database Design Overview
2.2.1.Modelling
2.2.2.Entities & Keys
2.2.3.Mapping & Relationships
2.2.4.Cardinality
2.2.5.Normalization
2.3.Modern Modelling
2.3.1.Relational Database
2.3.2.Entity Relationship Model
2.3.3.Object Model
2.3.4.Differences between relational and object-orientated models
2.4.Human Computer Interaction
2.4.1.Computer Characteristics
2.4.2.Human Factors
2.4.3.Information Visualization
2.4.4.FishEye / Lens Visualization
2.5.Current 3D Modelling Approaches
2.5.1.3D Modelling Processes
2.5.2.PaperVision3D
2.6.Modern Representation of Data
2.6.1.Information Retrieval & Digital Libraries
2.6.2.Kartoo
2.6.3.Intelligent User Interfaces
2.7.Conclusion
3.System Design and Implementation
3.1.Bubble Modelling
3.2.System Design
3.2.1.Architecture
3.2.1.1.Language
3.2.1.2.Framework
3.2.1.3.GUI
3.2.1.4.Middle Tier
3.2.1.5.Database
3.2.2.Prototype Design
3.2.2.1.Initialization
3.2.2.2.Entity Construction
3.2.2.3.Navigation
3.2.2.4.Schema & Entity Creation
3.2.2.5.Schema Retrieval
3.3.System Implementation
3.3.1.Entity Creation
3.3.2.Create Relationships
3.3.3.Schema Deletion, Retrieval & Save
3.3.4.Navigation
4.Experimental Design and Setup
4.1.Websites
4.2.Experimental Design
4.3.User Guide
4.4.Feedback Questions
4.5.Case Study
4.6.Experiment 1 – Navigation
4.7.Experiment 2 – Create New Schema
4.8.Experiment 3 – Update Existing Schema
4.9.Experiment 4 – Find Table Information
5.Results
5.1.Experiment 1 – Getting Started
5.1.1.User Feedback
5.1.2.Trends & Exceptions
5.1.3.Interpretation of Results
5.2.Experiment 2 – Create New Schema
5.2.1.User Feedback
5.2.2.Trends & Exceptions
5.2.3.Interpretation of Results
5.3.Experiment 3 – Update Existing Schema
5.3.1.User Feedback
5.3.2.Trends & Exceptions
5.3.3.Interpretation of Results
5.4.Experiment 4 – Find Table Information
5.4.1.User Feedback
5.4.2.Trends & Exceptions
5.4.3.Interpretation of Results
6.Conclusions and Future Work
6.1.Conclusions
6.2.Future Work
References
List of Figures
Technologies & External Resources
1.Introduction
This thesis is concerned with exploring advances in data visualization techniques and seeing what advantages, if any, can be obtained from applying these techniques to data modelling approaches.
In the modern world, with all the technological advances in the web and the emergence of technical powerhouses such as Google, there is an abundance of information freely available. One of the challenges of having so much information and data available is how to represent and display this data so that it is useful.
Similar issues arise in the visualisation of large databases. How can a database containing hundreds or thousands of tables be modelled or displayed so as to give an accurate picture of the database?
Information retrieval on the web was revolutionized with the emergence of Google and their searching and filtering algorithms. Similarly back in 1976, the modelling of databases was revolutionized by Dr Chen with his entity relationship model.
Both of these approaches still face issues, though. This has triggered a lot of advances in the field of information visualization. On the web there are alternative search engines emerging, such as KartOO, which groups data in easily navigable maps. There are also many interactive charting tools, such as Google Charts, loaded with functionality for displaying and efficiently using massive volumes of data.
Data modelling seems to have fallen behind these advances. However, data modelling has similar goals to information retrieval systems such as Google and KartOO and information visualization systems such as interactive charts. One goal of data modelling is to accurately display large volumes of data so that it can be used effectively. There is no reason why techniques such as the fisheye lens technique used in interactive charts cannot also be utilized in modelling a database to create a more visual and navigable model.
This study is concerned with evaluating the improvements that can be obtained from incorporating data visualization and 3D rendering techniques into the process of data modelling. Improvement is a broad term. By redesigning the data modelling methodology and developing a tool which incorporates some modern data visualization functionality, this study aims to measure improvements in the simplicity of the data modelling process as well as the readability and usability of the data models created.
1.1.Research Overview
Object-oriented database systems are designed to meet the requirements of advanced database applications. These requirements can constantly change and evolve over time and so must be managed consistently across all levels of abstraction [1]:
-The database level
-The database schema level
-The data model level
This thesis deals primarily with the data model level. It proposes to evaluate current data modelling techniques and practices, design a 3D alternative to the classic flat tree-view menu approach and develop an interface to implement this new approach.
The complexity of databases tends to grow over time as tables are added, removed, modified and enhanced. Analyzing these databases is a very time-consuming process and usually requires a high level of both expertise and familiarity with the schema.
This thesis will evaluate both the advantages and limitations of current 2D data modelling methodologies. Based on these findings it will propose expanding these methodologies to incorporate a new 3D data modelling approach which will include advances in the information visualization field. We hypothesise that improving upon the current 2D methodology will simplify both the design and upkeep of large and complex databases.
An integral part of this research will be to develop a graphically and functionally rich tool to implement this new 3D modelling approach. The tool will allow exploration of complex databases by creating graphical displays of tables for a user to navigate through.
1.2.Scope
This thesis will be limited to investigating database data modelling and will be composed of three main phases:
- Evaluating current data modelling techniques
The first phase of this project will be to research current data modelling techniques and outline the key advantages and limitations of these approaches. This phase will also be concerned with researching advances in both 3D modelling approaches and the field of data visualization which has parallels to this study.
- Designing a new 3D modelling approach to database modelling
This phase of the project will design a new 3D modelling methodology using the research obtained in phase one. It will explore the advantages of 3D modelling over the current flat approach. The rules and outline of the new methodology will be defined in this phase including representing tables, elements, attributes and relationships in 3D.
- Developing an RIA interface for the new 3D modelling approach
The third phase of this project will be to design and develop a functionally and visually rich interface to implement the 3D modelling approach designed in phase two. The tool will allow the creation of new database schemas along with the ability to manipulate schemas created by the tool. Based on time constraints this could be enhanced to also read in existing schemas.
1.3.Significance of the study
This study aims to provide an alternative approach to current data modelling methodologies. This is an area which has evolved significantly over the past number of years. From Chen’s proposal of entity relationship models, to normalization, to UML modelling, database modelling techniques have been constantly changing. Substantial improvements in the modelling process have been made with the introduction of software packages such as IBM’s Rational Rose and Microsoft’s Visio. However, these all have the same fundamental flaw: there will always be limitations to how much data can be displayed on a screen at one time. The larger and more complicated a database gets, the greater the amount of screen real estate the model will take up. The more relationships and complexity added to the database, the more cluttered and overpopulated the model becomes. This is a problem which has also plagued other fields, including both information retrieval and data visualization systems.
The ability to more intuitively model the data would greatly improve its usability for users. By incorporating some simple techniques such as 3D spatial navigation or
Fig 1.1
3D modelling and more modern data visualization techniques offer a solution to this problem. A 3D model has the advantage of an extra dimension, so a much larger set of data can be displayed clearly in a much smaller space. Multiple relationships on single entities could be displayed much more cleanly and effectively. Also, utilizing data visualization advances from similar fields would allow simpler navigation and analysis of large and complex schemas and greatly simplify the modelling process. The advantage of this to any corporation would be time saved in both the analysis and design of complex schemas.
Fig 1.2
1.4.Research Questions & Hypothesis
This section briefly introduces the research questions this study aims to answer along with the hypothesis this research will prove or disprove.
What are the limitations of current data modelling techniques?
To improve the data modelling process we first need to ascertain the exact limitations of this process.
What would the advantages be to adopting a 3D approach to data modelling?
Once we’ve discovered the limitations of the current modelling approaches, we will investigate how these can be improved by adopting a 3D approach.
How would modern data visualization techniques improve and simplify data modelling?
Building on the above, we aim to discover how the use of modern data visualization techniques could improve the usability of data models, making them more accessible to a broader audience.
Is it possible to represent a complex database with a simple and easily navigable model?
We will analyse the effect that adopting the new modelling technique, together with the data visualization functionality available in the prototype, has on a complex model, and whether it can simplify a complex design.
The thesis hypothesizes that current data modelling techniques create complex models to represent complex databases. Being restricted to a 2D realm, models take up a much greater area than they need to. These models are not easily navigable and require a certain level of both expertise and familiarity with the database being modelled to be useful to a user. By evolving the models into the 3D realm and incorporating some modern data visualization techniques and functionality into a new modelling tool, complex databases can be depicted by simple models. The user will be able to explore these models easily, simplifying any updates, enhancements or manipulation of the model.
1.5.Methodology
This research involves the practical implementation of a 3D modelling tool. As outlined in the scope, this involves three key phases which will be developed using an agile design approach. Each of these phases has a dependency on the previous phase; however, each phase will be broken into use cases and built up over iterations.
In phase 1, an ethnographic research approach would be useful in evaluating current modelling techniques. According to Spradley (1979), ethnography is "the work of describing a culture". He emphasizes, however, that "rather than studying people, ethnography means learning from people" [2]. There is a traditional culture of data modelling today which uses a flat 2D tree-view menu to display database schemas.
Understanding the reasons behind this mainstream trend, as well as why the data visualization advances made in similar fields have not been adopted in the data modelling field, will be key to the success of this research. As well as being used to ascertain the reasons behind the popularity of the 2D tree-view modelling approach, ethnographic studies could also be used to outline the key rules and requirements a new 3D approach would have to adhere to.
We will use a case study to investigate the evolution of current database modelling methodologies. This allows greater freedom to create a hypothesis. The case study used is described briefly in the next section. The end product of this phase will be the design and implementation rules of the new 3D methodology. This will then be used to help prove or disprove the hypothesis. In the experiments, which are discussed in greater detail in Section 5, a comparison between the new and old methodologies will be conducted to ascertain the advantages, if any, offered by the 3D approach.
An agile approach will be used for the development phase of this research. The outcome of this phase will be a prototype to allow 3D exploration of databases. This will both implement the new methodology created in Phase 2 above and provide some data visualization techniques to navigate through the new model. The first task in this phase will be prioritizing the functionality which is essential for the success of the project.
1.6.Case Study
As the thesis moves through development iterations, this case study will be expanded upon but the following is the basis for the case study. Design a relational database to support a modern payroll system.
The system must contain some key elements. All of an employee’s details should be accessible by the system. There are different types of employees. Some are paid an hourly rate and so qualify for overtime, while others are on an annual salary and cannot earn overtime. Employees on an annual salary are eligible for a bonus once a year which is a percentage of their annual wage. Employees can receive their pay either by check or straight into their bank account through direct deposit.
This system must be able to highlight where money is being spent. To achieve this, a record must be kept of the projects an employee bills their hours to. Each project is composed of multiple tasks and the system will be expected to support this granularity.
Deductions are taken from each pay check an employee receives. This can include taxes, health insurance and other miscellaneous deductions. A record needs to be kept of the deduction amount from each pay check as well as the total yearly deductions.
This case study was chosen as it is a very adjustable model. It can start off relatively simple but can grow into a very complex system. This will allow the new methodology put forward by this thesis to be tested on multiple levels, from creating a new model to enhancing a relatively straightforward model to support a much more complex system. The case study is discussed in more detail in Chapters 3 and 4.
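To make the case study requirements more concrete, the following is a minimal sketch of one possible relational schema for the payroll system, written in Python against an in-memory SQLite database. Every table name, column name and helper function here is a hypothetical assumption made for illustration only; it is not the schema developed later in this thesis, and it covers only the entities named in the description above (employees, projects, tasks, billed hours, pay checks and deductions).

```python
import sqlite3

# Illustrative schema only: all table and column names are assumptions,
# chosen to mirror the entities named in the case study description.
SCHEMA = """
CREATE TABLE employee (
    employee_id    INTEGER PRIMARY KEY,
    name           TEXT NOT NULL,
    pay_type       TEXT NOT NULL CHECK (pay_type IN ('hourly', 'annual')),
    hourly_rate    REAL,   -- hourly employees only (eligible for overtime)
    annual_salary  REAL,   -- annual employees only (no overtime)
    bonus_percent  REAL,   -- yearly bonus as a percentage of annual salary
    payment_method TEXT NOT NULL
        CHECK (payment_method IN ('check', 'direct_deposit'))
);
CREATE TABLE project (
    project_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE task (
    task_id    INTEGER PRIMARY KEY,
    project_id INTEGER NOT NULL REFERENCES project(project_id),
    name       TEXT NOT NULL
);
CREATE TABLE time_entry (
    employee_id INTEGER NOT NULL REFERENCES employee(employee_id),
    task_id     INTEGER NOT NULL REFERENCES task(task_id),
    hours       REAL NOT NULL
);
CREATE TABLE pay_check (
    pay_check_id INTEGER PRIMARY KEY,
    employee_id  INTEGER NOT NULL REFERENCES employee(employee_id),
    pay_date     TEXT NOT NULL,   -- ISO date string, e.g. '2010-09-30'
    gross_amount REAL NOT NULL
);
CREATE TABLE deduction (
    pay_check_id INTEGER NOT NULL REFERENCES pay_check(pay_check_id),
    kind         TEXT NOT NULL,   -- e.g. 'tax', 'health_insurance'
    amount       REAL NOT NULL
);
"""


def create_payroll_db():
    """Create an empty in-memory database containing the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def yearly_deductions(conn, employee_id, year):
    """Sum every deduction taken from an employee's pay checks in one year."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(d.amount), 0)
        FROM deduction d
        JOIN pay_check p ON p.pay_check_id = d.pay_check_id
        WHERE p.employee_id = ? AND strftime('%Y', p.pay_date) = ?
        """,
        (employee_id, str(year)),
    ).fetchone()
    return row[0]
```

Even this small sketch already contains six tables and five foreign-key relationships, which gives a sense of how quickly a flat diagram of the full system could become cluttered. Storing each deduction per pay check and deriving the yearly total by query is one design choice among several; a stored yearly total would be an equally valid reading of the requirement.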
1.7.Success Criteria
This study will be deemed successful should the steps outlined in the scope be met:
- A sufficient investigation into current data modelling practices is conducted with both advantages and limitations documented. Data visualization techniques which have parallels to this research are identified and researched.
- A workable 3D data modelling methodology has been successfully designed and documented.
- A functionally rich tool has been developed to implement the aforementioned 3D design. The design of this should also be documented.
- An academic thesis document is prepared which adheres to MSc requirements and consolidates the findings from each of the above three phases.
1.8.Thesis Layout
As described in the scope above, this thesis is structured so as to gradually approach the goal of a completely functional 3D data modelling tool. First, relevant literature is examined in order to establish and clarify the concepts and approaches which will be used. Topics covered will include database design and data modelling. Some history as well as the key concepts of both will be investigated and briefly introduced. The focus will then move on to 3D representation and visualization of data. A brief review of current approaches will be conducted, highlighting their advantages and limitations. Prior to this, there will be some discussion on the HCI aspects of representing these models.
Next we will move into our design chapter. This will outline the design and implementation of both the new 3D methodology and the data visualization tool used to implement this methodology. Justification and explanation of design decisions and approaches will be documented.
We will then move into our experimental design chapter. This will outline the design of the experiments used in our examination of the hypothesis and justify why they were chosen. The next chapter provides a detailed description of the results and findings of the experiments and whether the hypothesis was proved or disproved.
The conclusion will evaluate the progress and success of the project and what contribution it might provide to other research. A broad explanation of any future work will also be provided.
2.Literature Review
2.1.Introduction
This chapter begins with a broad overview of modern database design, followed by a brief exploration of some current data modelling approaches. The importance of HCI in modern systems is then briefly examined with an emphasis on data visualization. This is followed by a broad overview of modern 3D modelling techniques and a brief review of current approaches in existing tools. Finally, there is a section briefly describing some other approaches to information visualisation which have parallels to this thesis.