EG6/WG3/ ModMerise_09 15
Date of issue : December 1, 1997
Last Update : July 30, 2003
Reference to be quoted : EG6/WG3/ ModMerise_09
Status : final
Author : WG3
CLASET
Merise
Conceptual Data Model
Table of contents
1. OVERVIEW
2. BASIC REQUIREMENTS
3. READERSHIP
4. DESCRIPTION
4.1 NOTATION
5. THE CLASET DATA MODEL
5.1 THE CONCEPTUAL DATA MODEL – SUMMARY DESCRIPTION
5.1.1 Introduction
5.1.2 CLASET global data model
5.1.3 The responsible agency data model
5.1.4 The classification part data model
5.1.5 The classification links part data model
5.2 CONCEPTUAL DATA MODEL - DETAILED DESCRIPTION
5.2.1 Introduction
5.2.2 Introduction of new concepts
5.2.2.1 Subset
5.2.2.2 Event_log
5.2.3 The classification data model
5.2.3.1 DESCRIPTION OF ENTITIES
5.2.3.1.1 CLASSIFICATION entity definition
5.2.3.1.2 EVENT_LOG entity definition
5.2.3.1.3 ITEM entity definition
5.2.3.1.4 LEVEL entity definition
5.2.3.1.5 RESPONSIBLE_AGENCY entity definition
5.2.3.1.6 NAME_AND_ADDRESS entity definition
5.2.3.1.7 NAME_COMMUNICATION entity definition
5.2.3.1.8 PROPERTY entity definition
5.2.3.1.9 PROPERTY_TEXT entity definition
5.2.3.1.10 SUBSET entity definition
5.2.3.1.11 SUBSET_DEFINITION entity definition
5.2.4 The classification links part data model
5.2.4.1 DESCRIPTION OF ENTITIES
5.2.4.1.1 LINK_SET entity definition
5.2.4.1.2 CLASSIFICATION_LINK entity definition
5.2.4.1.3 LEVEL_LINK entity definition
5.2.4.1.4 ITEM_LINK entity definition
5.2.5 The CLASET message data model
5.2.5.1 RESOLVING OF THE MANY TO MANY CARDINALITIES
5.2.5.1.1 CLASET - DECOMPOSE - CLASSIFICATION
5.2.5.1.2 CLASET - CONTAIN - SUBSET
5.2.5.1.3 CLASSIFICATION - CLA_RESPONSIBLE - RESPONSIBLE_AGENCY
5.2.5.1.4 LINK_SET - LINK_RESPONSIBLE - RESPONSIBLE_AGENCY
5.2.5.1.5 SUBSET - SUBS_RESPONSIBLE - RESPONSIBLE_AGENCY
5.2.5.1.6 ITEM - TARGET_ITEM - ITEM_LINK
5.2.5.2 DESCRIPTION OF ENTITIES
5.2.5.2.1 The CLASET entity definition
5.2.5.2.2 CLASET_PARTY entity definition
5.2.5.2.3 CLASET_CONTACT entity definition
5.2.5.2.4 CLASET_COMMUNICATION entity definition
5.2.5.2.5 SUBSET_DEFINITION entity definition
5.2.5.2.6 CLASSIFICATION entity definition
5.2.5.2.7 LINK_SET entity definition
5.2.5.2.8 CLASSIFICATION_LINK entity definition
5.2.5.2.9 ITEM entity definition
5.2.5.2.10 ITEM_LINK entity definition
5.2.5.2.11 LEVEL entity definition
5.2.5.2.12 LEVEL_LINK entity definition
5.2.5.2.13 PROPERTY entity definition
5.2.5.2.14 SUBSET entity definition
5.3 MESSAGE FUNCTIONALITIES
5.3.1 Requesting/Sending General Information on Classification(s)
5.3.2 The definition of a Classification and its structure
5.3.3 A description of a classification’s link set
5.3.4 A description of the modifications made on a nomenclature
5.3.5 A complete description of a classification
6. CONCLUSION
7. GLOSSARY
1. OVERVIEW
The systematic classification of some phenomena and the naming of the classes provides the common language which makes consistent communication possible [cited after T.M.F. Smith].
In statistics, classifications or nomenclatures are basic instruments for the efficient collection and analysis of data from economic operators, social institutions, administrations, etc. Classifications are vital for the production of comparable statistical information and for its dissemination.
2. BASIC REQUIREMENTS
The new EDI message CLASET is designed to exchange structured metadata, classifications, tree structures or code lists, and the links between them. These will subsequently be referred to as classifications.
This message covers the following exchange scenarios :
· general information on classifications,
· all or part of the content or structure of a classification,
· data maintenance operations on classifications,
· all or part of the links between classifications,
· and, any combination of the above.
The message has been designed in a generic way and provides mechanisms to describe the nature of the information exchanged within it. The message can be used to exchange requests and responses.
This CLASET message will be used by organisations involved in data maintenance operations on classifications and by users of classifications: for example, to exchange statistical classifications, customs tariffs, product catalogues, organisation charts, tables of links between products, catalogues and official classifications.
3. READERSHIP
This document is intended primarily for people who are designing and implementing classification exchange systems. Standardisation bodies and other administrations such as Customs may also find this document useful.
To get the most out of this document, you should be familiar with the classification exchange problem and the EDIFACT message design procedure.
4. DESCRIPTION
The message to be developed is based on a data model that describes, for the data to be exchanged, the entities associated with nomenclatures and their links. The definition of the CLASET data model consists of the following components :
· a conceptual data model,
· a description of entities including attributes and relationship descriptions,
· a description of the functionality covered by this data model.
To make this schema easier to understand, the concepts are introduced progressively and illustrated with examples that show how CLASET can handle both simple and complex classifications. The explanation starts at a high level of abstraction, showing how simple CLASET can be, and ends with a complete definition that indicates how complex classifications can be handled.
Readers who are not familiar with data modelling techniques and only want a general overview of the CLASET data model are invited to read section 5.1 of chapter 5 only. Section 5.2 is intended for readers who are familiar with data modelling techniques.
4.1 NOTATION
The notation used in this model is based on the MERISE methodology. CLASET has also been modelled in UML. The following description is sufficient to enable the reader to interpret the data model diagram.
This graphical element represents an entity, which can be defined as an object with which we want to associate some information.
As shown in this figure, the entity name is written in the upper box.
This second picture represents a relationship between entities, where each entity has a role of its own.
The roles may be shown on the line linking the relationship and the entity. The minimum and maximum cardinalities are represented by two values attached to each of these links. When these two values are in brackets, the unique identifier of the associated entity is composed of its own identifier plus the identifier of the related entity.
In a relationship, the cardinality describes the existence and multiplicity properties of entity occurrences:
- The existence property, represented by the minimum value, specifies whether each entity occurrence has to participate in the relationship to which the entity is connected (0 means optional; 1 means mandatory, i.e. the occurrence always participates).
- The multiplicity property, represented by the maximum value, specifies whether an entity occurrence can participate in one or in several relationship occurrences. The maximum value is the largest number of relationship occurrences in which a single entity occurrence can participate (by convention, the value n means an unlimited number).
Example:
From the above cardinalities, the following statements can be derived :
- Each occurrence of Level, Item and Classification must have one or many relationships to one occurrence of event_log;
- Each occurrence of event_log can be associated with any number of occurrences of Level, Item or Classification, but has to be associated with at least one occurrence of one of these objects.
A “(1,1)” cardinality indicates that the entity cannot be identified by itself but its unique identifier is composed of its identifier and the identifier of the related entity.
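The cardinality rules above can be illustrated with a minimal Python sketch. The class name `Cardinality`, its fields and the `allows` method are hypothetical names chosen for this illustration and are not part of the CLASET model; `None` stands for the conventional value n.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cardinality:
    """A (minimum, maximum) pair attached to one side of a relationship."""
    minimum: int             # existence property: 0 = optional, 1 = mandatory
    maximum: Optional[int]   # multiplicity property: None stands for "n" (unlimited)

    def is_mandatory(self) -> bool:
        return self.minimum >= 1

    def allows(self, occurrences: int) -> bool:
        """True if one entity occurrence may take part in `occurrences` relationship occurrences."""
        if occurrences < self.minimum:
            return False
        return self.maximum is None or occurrences <= self.maximum


ONE_TO_MANY = Cardinality(1, None)   # written 1,n in the diagram
OPTIONAL_ONE = Cardinality(0, 1)     # written 0,1 in the diagram

print(ONE_TO_MANY.allows(0))    # False: participation is mandatory
print(ONE_TO_MANY.allows(3))    # True: n means unlimited
print(OPTIONAL_ONE.allows(2))   # False: at most one occurrence
```

The sketch only covers the minimum and maximum values; the bracket convention for composed identifiers is not modelled here.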
REMARKS:
The cardinalities shown in this model support all functions of CLASET. A specific functional subset such as the exchange of a classification may have a more precise cardinality defined.
e.g.: When exchanging a classification, at least one item is required for each level, but when asking for information about the levels of a classification, no item is required.
In addition, the following representation convention will be used to enhance the legibility of the schema :
· Objects filled with upward diagonal lines are related to classification concepts.
· Objects filled with downward diagonal lines are related to the concepts of links between classifications.
· Objects drawn in a clear box describe entities used by both concepts.
5. THE CLASET DATA MODEL
5.1 THE CONCEPTUAL DATA MODEL – SUMMARY DESCRIPTION
5.1.1 Introduction
In order to facilitate the understanding of the CLASET conceptual data model, all the different concepts will be introduced step by step through sub-models going from a high level of abstraction to the concrete definition.
5.1.2 CLASET global data model
The objective of CLASET is to make it possible to exchange and request information on classifications and their tables of links in a standardised electronic way, in order to facilitate the maintenance, dissemination and use of classifications.
Following this statement, a general overview of the CLASET data model can be defined as follows :
Each CLASET exchange message includes at least one classification reference with its responsible agency and/or a set of links between classifications.
An exchange message is defined by :
· the identity of the message, e.g. CLASET 1.0
· the identity of the sender, e.g. Eurostat
· the reference of the message, e.g. 00001
· the identity of the recipient, e.g. Statec
· the function of the message, e.g. Classification definition
A classification reference is described by :
· the identifier of the classification, e.g. ISO 3166
· the classification version number, e.g. 3.0
A responsible agency can be defined as an organisation which is responsible for the compilation, the dissemination, the maintenance or the reference of classifications and link sets.
A responsible agency is characterised by :
· a code or a name, e.g. DIN
· optionally by an address, e.g. Burggrafenstrasse 6, Postfach 1107
· optionally by a contact person, e.g. D. Smith (Phone : xxx, Fax : yyy, E-mail : .....)
Remark :
A classification and a link set can have any number of responsible agencies.
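As a rough illustration of this overview, the following Python sketch represents an exchange message with its classification references and link sets. All class and attribute names (`ClasetMessage`, `ClassificationReference`, `LinkSetReference`, `has_content`) are assumptions made for this example, and "AGENCY_X" is a placeholder agency code; only the example values CLASET 1.0, Eurostat, 00001, Statec, ISO 3166 and 3.0 come from the text above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ClassificationReference:
    identifier: str                                     # e.g. "ISO 3166"
    version: str                                        # e.g. "3.0"
    agencies: List[str] = field(default_factory=list)   # any number of responsible agencies


@dataclass
class LinkSetReference:
    identifier: str                                     # hypothetical attribute
    agencies: List[str] = field(default_factory=list)   # any number of responsible agencies


@dataclass
class ClasetMessage:
    identity: str      # identity of the message
    sender: str
    reference: str
    recipient: str
    function: str
    classifications: List[ClassificationReference] = field(default_factory=list)
    link_sets: List[LinkSetReference] = field(default_factory=list)

    def has_content(self) -> bool:
        """A message carries at least one classification reference and/or one set of links."""
        return bool(self.classifications or self.link_sets)


message = ClasetMessage(
    identity="CLASET 1.0",
    sender="Eurostat",
    reference="00001",
    recipient="Statec",
    function="Classification definition",
    classifications=[ClassificationReference("ISO 3166", "3.0", agencies=["AGENCY_X"])],
)
empty = ClasetMessage("CLASET 1.0", "Eurostat", "00002", "Statec", "Classification definition")

print(message.has_content())   # True
print(empty.has_content())     # False
```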
5.1.3 The responsible agency data model
A responsible agency is identified by its name. It can have one or more contact persons, each with a different address. Each contact person may have several communication numbers such as telephone, fax, e-mail, etc. The detailed model is shown below :
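The agency structure just described can also be sketched in Python. The class names echo the entity names RESPONSIBLE_AGENCY, NAME_AND_ADDRESS and NAME_COMMUNICATION listed in the table of contents, but every attribute name and the `numbers` helper are assumptions made for illustration, not the attributes defined in the detailed description.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NameCommunication:
    channel: str   # hypothetical attribute, e.g. "phone", "fax", "e-mail"
    number: str


@dataclass
class NameAndAddress:
    contact_person: str
    address_lines: List[str] = field(default_factory=list)
    communications: List[NameCommunication] = field(default_factory=list)


@dataclass
class ResponsibleAgency:
    name: str                                                  # the agency is identified by its name
    contacts: List[NameAndAddress] = field(default_factory=list)

    def numbers(self, channel: str) -> List[str]:
        """Collect every communication number of the given channel over all contact persons."""
        return [
            communication.number
            for contact in self.contacts
            for communication in contact.communications
            if communication.channel == channel
        ]


agency = ResponsibleAgency(
    name="DIN",
    contacts=[
        NameAndAddress(
            contact_person="D. Smith",
            address_lines=["Burggrafenstrasse 6", "Postfach 1107"],
            communications=[NameCommunication("phone", "xxx"), NameCommunication("fax", "yyy")],
        )
    ],
)

print(agency.numbers("fax"))   # ['yyy']
```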
5.1.4 The classification part data model
Since the classification part can be considered as the kernel of the message structure definition, it is a concept that has to be defined and well understood before describing its structure.
At this stage, there are different approaches for defining the structure of a classification :
- Defining a generic structure :
It is possible to define the structure of a classification with a limited number of generic entities. Whilst this type of model will support any classification structure, it is not a useful base from which to design messages as there will be many ambiguities which will need to be resolved during implementation.
- Defining a fixed structure :
Unlike the previous approach, this leads to a message structure that is easier to understand and to put into practice, but which may be inconsistent with some classifications.
- A mixed approach :
This last approach combines the advantages of both solutions without their disadvantages, by offering a fixed structure that is sufficient for most classifications and a generic aspect to support more sophisticated requirements.
Based on these findings, we follow the third approach and define a classification structure with a fixed part and an open door to genericity:
The fixed structure of the classification part which will be the backbone of the CLASET data model consists of the “Classification - Level - Item” entities :
· A classification can be composed of level(s) containing item(s) (or sublevels).
· A level is identified by its identifier combined with the reference of its nomenclature (e.g. NACE REV 1, 4th level).
A level is part of one and only one classification.
· In the same way, an item is identified by its identifier (e.g. NACE REV 1, SECTION A).
An item is part of one and only one level.
The genericity of this sub-model is represented by the entity called “Property” which can be used :
· either to define any property associated with an element of the classification structure (e.g. domain of classification, keywords for a level, ...)
· or to define a structural element which does not fit the fixed structure.
The textual components of the property are represented by the “Property Text” entity.
The relationship "Tree_L" describes the hierarchy between levels within one classification. The relationship "Tree_I" describes the tree structure of the items.
Example :
Remark :
As previously indicated, the mandatory aspect of the cardinalities depends on the function being used: when requesting general information about a classification, it is not mandatory to supply its levels and items.
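The "Classification - Level - Item" backbone and its generic "Property" extension can be sketched as follows. This is a minimal Python illustration, not the entity definitions of the detailed description: attribute names, the `item_path` helper and the level identifiers "1" and "2" are assumptions. The `parent` references stand for the "Tree_L" and "Tree_I" relationships, and the empty default lists reflect the fact that levels and items are not always mandatory.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Property:
    name: str                                          # e.g. domain of classification, keywords
    texts: List[str] = field(default_factory=list)     # "Property Text" components


@dataclass
class Item:
    identifier: str
    parent: Optional["Item"] = None                    # "Tree_I": tree structure of the items
    properties: List[Property] = field(default_factory=list)


@dataclass
class Level:
    identifier: str
    parent: Optional["Level"] = None                   # "Tree_L": hierarchy between levels
    items: List[Item] = field(default_factory=list)
    properties: List[Property] = field(default_factory=list)


@dataclass
class Classification:
    identifier: str
    levels: List[Level] = field(default_factory=list)  # may stay empty for general information
    properties: List[Property] = field(default_factory=list)


def item_path(item: Item) -> List[str]:
    """Walk the Tree_I relationship upwards and return identifiers from the root down."""
    path = []
    node: Optional[Item] = item
    while node is not None:
        path.append(node.identifier)
        node = node.parent
    return path[::-1]


section_a = Item("A")
division_01 = Item("01", parent=section_a)
first_level = Level("1", items=[section_a])
second_level = Level("2", parent=first_level, items=[division_01])
nace = Classification("NACE REV 1", levels=[first_level, second_level])
general_info_only = Classification("NACE REV 1")       # no levels or items supplied

print(item_path(division_01))         # ['A', '01']
print(len(general_info_only.levels))  # 0
```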
5.1.5 The classification links part data model
The classification links part represents simultaneously :
· the type of relationship between two classifications: the source and the target;
· the type of relationship between two levels of these classifications;
· the transcodification rules between items of these classifications;
· and the possibility to group these relations and transcodification rules.
This can be graphically represented in the following way :
The “Link_set” entity groups occurrences of “Classification_link”, “Level_link” and “Item_link” in order to specify common properties and, optionally, to give the responsible agencies of the set of links.
The “Classification_link” entity expresses the relationship between the two classifications and is identified by the reference of both classifications : the source and the target.
The “Level_link” entity expresses the relationship between levels of different classifications and is identified by the reference of both levels : the source and the target.
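The grouping role of “Link_set” can be illustrated with a short Python sketch. The classifications "CLS_A" and "CLS_B", their level and item identifiers, all attribute names and the `transcode` helper are hypothetical. The way an item link is keyed here (a source item and a target item) is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class ClassificationLink:
    source: str                 # reference of the source classification
    target: str                 # reference of the target classification


@dataclass(frozen=True)
class LevelLink:
    source: Tuple[str, str]     # (classification, level) of the source level
    target: Tuple[str, str]     # (classification, level) of the target level


@dataclass(frozen=True)
class ItemLink:
    source_item: str            # assumed key: identifier of the source item
    target_item: str            # assumed key: identifier of the target item


@dataclass
class LinkSet:
    identifier: str
    agencies: List[str] = field(default_factory=list)   # optional responsible agencies
    classification_links: List[ClassificationLink] = field(default_factory=list)
    level_links: List[LevelLink] = field(default_factory=list)
    item_links: List[ItemLink] = field(default_factory=list)

    def transcode(self, source_item: str) -> List[str]:
        """Return every target item linked to the given source item."""
        return [link.target_item for link in self.item_links if link.source_item == source_item]


links = LinkSet(
    identifier="CLS_A to CLS_B",
    classification_links=[ClassificationLink("CLS_A", "CLS_B")],
    level_links=[LevelLink(("CLS_A", "1"), ("CLS_B", "1"))],
    item_links=[ItemLink("A1", "B7"), ItemLink("A1", "B8"), ItemLink("A2", "B9")],
)

print(links.transcode("A1"))   # ['B7', 'B8']
```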