CICABC and language management needs.

Ph. Dubois

Sept 2006

Foreword.

The goal of this document is to gather and summarize the functional needs relating to language management.

Functionalities.

Functionality / Description
Multilingual document concept. / We plan to manage multilingual documents by grouping the different language versions of the same document in an “envelope”. Some metadata will be attached to the envelope and other metadata will be attached to each language version of the document. Versioning can be applied both at envelope level and at language-version level. Metadata at envelope level will be versioned. Metadata attached to a language version of the document will not be versioned.
We will not manage relations such as “is a translation of” or “was translated from”.
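The envelope concept could be sketched as the following data model. This is a minimal illustration, not a design decision: the class names `Envelope` and `LanguageVersion`, their fields, and the choice to store versions as simple in-memory lists are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class LanguageVersion:
    """One language version of a document; its metadata is NOT versioned."""
    language: str
    content_versions: list[str] = field(default_factory=list)
    metadata: dict[str, str] = field(default_factory=dict)  # overwritten in place

    def add_content_version(self, content: str) -> int:
        self.content_versions.append(content)
        return len(self.content_versions)  # 1-based version number


@dataclass
class Envelope:
    """Groups the language versions of one document; its metadata IS versioned."""
    documents: dict[str, LanguageVersion] = field(default_factory=dict)
    metadata_versions: list[dict[str, str]] = field(default_factory=list)

    def set_metadata(self, **changes: str) -> int:
        # Each change stores a new snapshot instead of overwriting the old one.
        current = dict(self.metadata_versions[-1]) if self.metadata_versions else {}
        current.update(changes)
        self.metadata_versions.append(current)
        return len(self.metadata_versions)  # 1-based version number

    def language_version(self, language: str) -> LanguageVersion:
        # Create the language version on first access (hypothetical convenience).
        return self.documents.setdefault(language, LanguageVersion(language))
```

Note that the model holds no link between language versions other than their shared envelope, which matches the decision not to track translation relations.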
Searching / The elements returned by a search will be envelopes, not documents. The search criteria will be applied to the document content or metadata (with or without specifying a language) and to the envelope metadata. By default, only the latest version of the document content and of the envelope will be searched.
If more than one document in the same envelope matches the search criteria, the envelope will appear only once in the search result.
An optional feature could be to let users choose whether envelopes or documents appear in the search result.
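The search behaviour described above might look like the sketch below. The function name `search_envelopes` and the dictionary layout are hypothetical, matching is reduced to a case-insensitive substring test, and document metadata is left out for brevity.

```python
def search_envelopes(envelopes, term, language=None):
    """Return the ids of envelopes matching `term`, each at most once.

    `envelopes` maps an envelope id to a dict with two (hypothetical) keys:
      "metadata":  {name: value} for the latest envelope version
      "documents": {language: latest content text}
    """
    needle = term.lower()
    hits = []
    for env_id, env in envelopes.items():
        in_metadata = any(needle in str(v).lower() for v in env["metadata"].values())
        in_documents = any(
            needle in text.lower()
            for lang, text in env["documents"].items()
            if language is None or lang == language
        )
        if in_metadata or in_documents:
            # One entry per envelope, however many of its documents match.
            hits.append(env_id)
    return hits
```

Because the result list is built per envelope rather than per document, an envelope whose French and English versions both match is still returned only once.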
Dynamically defined properties. / An IG Leader should be able to define a set of properties that will apply to a certain type of document in his or her interest group.
Searching specific IG properties. / The dynamically defined properties should be searchable and specific to the IG.
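One way to picture IG-specific, dynamically defined properties is a registry keyed by interest group and document type. The class `PropertyRegistry`, its methods, and the sample IG names used with it are assumptions made purely for illustration.

```python
from collections import defaultdict


class PropertyRegistry:
    """Holds property definitions per interest group (IG) and document type."""

    def __init__(self):
        # (ig, document_type) -> set of property names defined by the IG Leader
        self._definitions = defaultdict(set)

    def define(self, ig, document_type, *names):
        self._definitions[(ig, document_type)].update(names)

    def allowed(self, ig, document_type):
        # Return a copy so callers cannot alter the definitions by accident.
        return set(self._definitions.get((ig, document_type), set()))

    def validate(self, ig, document_type, values):
        """Raise ValueError if `values` uses a property the IG has not defined."""
        unknown = set(values) - self.allowed(ig, document_type)
        if unknown:
            raise ValueError(
                f"undefined properties for {ig}/{document_type}: {sorted(unknown)}"
            )
        return values
```

Keying on the IG keeps a property defined in one interest group invisible to the others, so a search on that property only makes sense within its IG.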
Stop words depending on the language, for properties and documents. / It should be possible to apply a different stop word list depending on the language version of the property. The same applies to documents.
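A minimal sketch of language-dependent stop words follows. The two stop word lists are deliberately tiny and illustrative, and the name `index_terms` is hypothetical; real lists would presumably be configurable.

```python
import re

# Illustrative stop word lists only; not a proposal for the real content.
STOP_WORDS = {
    "en": {"the", "of", "and", "a"},
    "fr": {"le", "la", "les", "de", "et"},
}


def index_terms(text, language):
    """Tokenize `text` and drop the stop words of the given language."""
    stop = STOP_WORDS.get(language, set())  # unknown language: nothing removed
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if t not in stop]
```

The same function can serve property values and document content, since both only need the text and the language of the version being indexed.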
More control over character indexing. / We would like to control:
  • Whether indexing is accent-sensitive.
  • Whether indexing is case-sensitive.
  • How special characters such as ‘œ’ are indexed: as ‘o’ followed by ‘e’, or as the single character ‘œ’.
  • Whether ‘ø’ is indexed as ‘o’ or as ‘ø’.
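The four controls listed above could be exposed as options of a normalization step applied before indexing, as in the sketch below. The function `normalize`, its parameter names, and the `EXPANSIONS` table are assumptions. An explicit table is needed because ‘œ’ and ‘ø’ have no Unicode decomposition, so accent stripping alone leaves them unchanged.

```python
import unicodedata

# Characters without a Unicode decomposition need an explicit mapping.
# Mapping 'ø' to 'o' is one possible answer to the open question, not a decision.
EXPANSIONS = {"œ": "oe", "Œ": "OE", "ø": "o", "Ø": "O"}


def normalize(term, accent_sensitive=False, case_sensitive=False, expand_special=True):
    """Return the key under which `term` would be indexed."""
    if expand_special:
        term = "".join(EXPANSIONS.get(ch, ch) for ch in term)
    if not accent_sensitive:
        # Decompose, then drop combining marks (accents, cedillas, ...).
        decomposed = unicodedata.normalize("NFD", term)
        term = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    if not case_sensitive:
        term = term.casefold()
    # Recompose so that equal keys compare equal as strings.
    return unicodedata.normalize("NFC", term)
```

Applying the same function to both indexed text and query terms keeps the two consistent, whichever combination of options is chosen.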

Searching properties and documents by criteria on properties in a specified language. / If a property exists in several languages, it should be possible to search for a word in the property either independently of the language or restricted to the version of the property in one given language.
Editing properties in different languages. / It should be possible to specify a property value for a given language. Example: setting the value of the property “title” in French or in English.
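Per-language property values, and searching them with or without a language restriction, could be sketched as follows. The class `MultilingualProperty` and its methods are hypothetical, and matching is simplified to whole-word, case-insensitive comparison.

```python
class MultilingualProperty:
    """A property whose value can be set independently per language."""

    def __init__(self, name):
        self.name = name
        self.values = {}  # language code -> value

    def set(self, language, value):
        self.values[language] = value

    def matches(self, word, language=None):
        """True if `word` occurs in the value for `language` (any language if None)."""
        if language is None:
            candidates = list(self.values.values())
        else:
            candidates = [self.values.get(language, "")]
        needle = word.lower()
        return any(needle in value.lower().split() for value in candidates)
```

For instance, with a “title” set to “Rapport annuel” in French and “Annual report” in English, a search for “rapport” succeeds when no language is given or when French is requested, and fails when restricted to English.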
Searching for preceding versions of documents. / It should be possible to specify versions (envelope version and document version) when searching for a document.
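Version-aware searching might be sketched as below. For brevity the example handles a single version axis, which could stand for either the envelope version or the document version. The function names and the list-based history are assumptions.

```python
def content_at(history, version=None):
    """Return the content stored under `version` (1-based), or the latest if None."""
    if not history:
        raise LookupError("no versions stored")
    if version is None:
        return history[-1]
    if not 1 <= version <= len(history):
        raise LookupError(f"version {version} does not exist")
    return history[version - 1]


def search_versions(histories, term, version=None):
    """Return ids whose content at `version` contains `term` (latest by default)."""
    needle = term.lower()
    hits = []
    for doc_id, history in histories.items():
        try:
            text = content_at(history, version)
        except LookupError:
            continue  # this item has no such version, so it cannot match
        if needle in text.lower():
            hits.append(doc_id)
    return hits
```

Leaving `version` unset reproduces the default behaviour of searching only the latest version.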