Report to CSC

IFD Library Opportunity

Date: September 25-26, 2006

Location: Lisbon, Portugal

Introduction

This short paper is intended to capture the spirit of discussions held in Lisbon, Portugal regarding an initiative offered by STABU (Lexicon project) and the Norwegian Barbi project, the two founding partners in the Industry Foundation for Dictionaries (IFD) project. In short, the IFD partners are offering the results of approximately $5 million of investment to other organizations at no cost, in return for participation in the continued development of their project into a single unified terminology library that they hope will become a de facto global standard.

History

At ISO meetings in Vancouver in 1999, a "Standing Committee" met with participants from ICIS, IAI, and a variety of other organizations developing IT standards for the building industry (leading to what we are today calling BIM). There was agreement at this meeting that some sort of standardized global terminology was necessary, and that its structure must allow computers to exchange data reliably, irrespective of language.

The result of that discussion, prompted by a push by ICIS, was that the ISO committee TC59/SC13/WG6 was struck to develop the standard now known as ISO 12006-3 – Framework for object-oriented information exchange. This standard defines the rules by which terms can be stored, independent of language, and in a manner which can be understood by dissimilar software applications which are each compliant with the standard.

Once ISO 12006-3 was published, STABU Lexicon (STABU being a non-profit foundation in Holland) and Barbi in Norway each began development of terminology databases compatible with the standard. Concurrently, to prove that the concept worked with complementary technologies, those libraries were each tested with IFC-compatible databases of real projects. The testing was successful; the two technologies co-existed to provide even more power and flexibility.

In January 2006, STABU and Barbi signed an agreement that they would combine their efforts to produce a single terminology database that they would share between themselves for mutual benefit.

The Technology

Very briefly, a terminology library is like a dictionary. But whereas a dictionary uses a "phrase" to define a term, computers do not understand phrases, so more information is needed before a computer can distinguish one concept from another. An object-oriented terminology library identifies each term using a serial number (a Globally Unique ID), and describes its definition by associating the term with other terms. The rules about what "kind" of term it is, and how it must relate to other terms, are dictated by the ISO standard (akin to the rules of English, but in computer terms). Computers can then use the unique IDs to exchange information about the concepts accurately, regardless of the names assigned to them.
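As a rough illustration of the idea (not the actual IFD Library schema), the sketch below stores two concepts under generated unique IDs, attaches names in several languages, and links one concept to the other. The class, the "kind" labels, the relation label and the sample terms are all hypothetical, and a standard UUID stands in for whatever ID format the library really uses.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Concept:
    guid: str                                    # language-independent identifier
    kind: str                                    # hypothetical label, e.g. "subject" or "property"
    names: dict = field(default_factory=dict)    # language code -> name
    related: dict = field(default_factory=dict)  # relation label -> list of GUIDs


def new_concept(kind, **names):
    """Create a concept with a fresh unique ID and the given names."""
    return Concept(guid=str(uuid.uuid4()), kind=kind, names=dict(names))


# Illustrative entries only; not taken from the real library.
door = new_concept("subject", en="door", nb="dør", nl="deur")
width = new_concept("property", en="width", nb="bredde", nl="breedte")

# The definition of "door" is expressed by relating it to other concepts.
door.related.setdefault("has_property", []).append(width.guid)

library = {c.guid: c for c in (door, width)}


def name_of(guid, language):
    """Look up the display name of a concept in one language."""
    return library[guid].names[language]
```

Two applications exchanging `door.guid` refer to the same concept even if one displays "door" and the other "dør"; the names are only labels attached to the ID.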

In practice, this means that objects in a BIM model that are created from (or associated with) terms in this library automatically inherit standard sets of properties as well as pre-defined relationships with other concepts. These are things that a computer cannot be taught, but which are encapsulated in the structure upon which the library is built. As a result, information can be associated with other relevant concepts automatically, without a human having to make the connection. Likewise, when a human further refines an object by adding values and descriptions about real objects, those values can be instantly translated into the other available languages (because humans have already created the corresponding terms and validated that they are accurate).
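The inheritance described above can be pictured with a small sketch. It assumes, purely for illustration, that each concept has at most one parent and a list of properties declared directly on it; the concept names and properties are invented and are not drawn from the library.

```python
# concept id -> (parent id or None, properties declared directly on the concept)
# All entries are hypothetical examples.
CONCEPTS = {
    "opening-element": (None, ["width", "height"]),
    "door": ("opening-element", ["fire rating"]),
    "fire-door": ("door", ["self-closing"]),
}


def inherited_properties(concept_id):
    """Collect a concept's own properties plus those of all its ancestors."""
    props = []
    while concept_id is not None:
        parent, own = CONCEPTS[concept_id]
        props = own + props  # keep ancestor properties first
        concept_id = parent
    return props
```

An object tagged as a "fire-door" would pick up width, height and fire rating without anyone listing them by hand, which is the kind of automatic association the library structure is meant to provide.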

The Present

Currently, all of the IT development time and a wide variety of testing (to verify the concept) have been invested by the two existing stakeholders to make the terminology library a reality. They have designed the physical database to store the concepts, and developed software tools to allow non-IT industry professionals to populate and verify the terms without having to understand the underlying technology.

Provision has already been made for any number of languages. An added benefit is the concept of "dialects", which presupposes that most terms of a language apply unchanged, except for a selected subset. This concept would allow CSC and CCDC (as an example) each to refine their own terms with specific definitions, and both would be related to Canadian English (or likewise to Canadian French, of course).
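One way to picture dialects is as a fallback chain: a dialect stores only the terms it overrides and defers to its parent language for everything else. The sketch below assumes that model; the language labels, concept IDs and sample names are hypothetical, and the real library may organize dialects differently.

```python
# language or dialect -> {concept id: name}; a dialect lists only its overrides.
NAMES = {
    "en": {"c1": "color", "c2": "door", "c3": "specification section"},
    "en-CA": {"c1": "colour"},
    "en-CA-CSC": {"c3": "spec section"},  # invented override, for illustration
}

# dialect -> the language it falls back to when a term is not overridden
FALLBACK = {"en-CA-CSC": "en-CA", "en-CA": "en"}


def resolve_name(concept_id, language):
    """Return the most specific name available, walking up the fallback chain."""
    while language is not None:
        name = NAMES.get(language, {}).get(concept_id)
        if name is not None:
            return name
        language = FALLBACK.get(language)
    raise KeyError(concept_id)
```

Under this assumption, a CSC user would see "spec section" and "colour", while "door" comes straight from the shared English entry.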

The Offer

The IFD partners are now inviting other parties that have expressed an interest in their work to join them in populating and using the library for real work in our countries. The IFD partners are willing to share all work to date, in return for assistance from others in adding more concepts and translating the existing ones into our various domains. The result would be a truly international terminology library that is not owned by any one individual. It is anticipated that a non-profit entity would manage the physical content itself (purchase and manage the servers, backups, update the tools, etc.). The consortium members would each be responsible for their own domains. The cost of this entity would, for now, continue to be funded by the existing stakeholders, but eventually a business plan would be developed to enable the entity to be self-funding.

The end result would be a truly global set of terms and definitions, in which every stakeholder may use the technology that has been developed for their own purposes. They participate by populating (or verifying) terms in their own language and domain. Those terms may be used by applications of interest to ourselves, and will also be available to all other stakeholders. Likewise, terms and concepts defined elsewhere can be reviewed and validated for use by us.

The existing infrastructure will be made available to anyone interested in participating, and copyrights are protected through a General Public License (which in short means that people are free to use it at will, but must not re-sell it or profit from it for personal gain). This means that the terms themselves must not be re-sold; the business opportunities lie in the applications that depend upon them.

Recommendation

I recommend that CSC and Digicon participate in the IFD Library consortium, to develop the Canadian English version for use in Canada. A variety of reasons support this view:

  • no competing standard appears to be available at present (there are similar terminology dictionaries under development, but their focus is national in scope),
  • the IFD Library is based upon a number of established ISO standards (not just ISO 12006-3, which forms the underlying structure),
  • there are already two founding members, and the meetings in Lisbon revealed interest by everyone in the group (although no contracts were signed here).

As a side comment, Roger Grant from CSI staff attended the Lisbon meeting as well; his impressions mirror mine. As some of you are aware, CSI have already engaged a consultant to prepare a report containing recommendations on how to proceed to build a terminology library for the USA. A copy of this report was shared with me for information, and will eventually be shared with CSC through Mary Friesen. The report will contain recommendations that support the concept of a CSI alliance with the IFD Library consortium. Roger indicated that he would personally support an alliance with CSC to work cooperatively to populate the Canadian and US English portions (99% of which share common terms; the differences being primarily in the spelling of names).

Prepared by,

David Watson, CET