Foreign Language Cataloguing Workshop

26/27 November 2009

Rachel Marsh

Table of Contents

1. Pay and Display

1.1 Using appropriate fonts and coding

2. Taking the ‘dire’ out of diacritic?

2.1 Special Character Entry

2.2 Special Character Mode

3. MARC my words

3.1 880

3.2 066

3.3 041

4. Red sauce or brown sauce?

Sourcing foreign language records

5. Foreign keyboards / adding languages to the PC

1. Pay and Display

For diacritics and special characters to display successfully in Voyager and your web browser the font preferences of these applications must be set correctly. It is important to check all your preferences before starting work.

1.1 Using appropriate fonts and coding

Voyager Cataloguing client

The font must be set to ‘Arial Unicode MS’

To do this:

·go to Options then Preferences

·The ‘Sessions defaults and Preferences’ window will open

·Go to the Colors/Fonts tab

·Select ‘Arial Unicode MS’ from the drop-down font menu.

·Click OK

Browsers

Browsers should also be set to use Arial Unicode MS.

To set the font in Internet Explorer:

·from the Tools pull-down menu select Internet Options

·on the General tab click on Fonts under ‘Appearance’

·select Latin based from the Language script menu

·then select Arial Unicode MS from the Webpage font menu

·click OK then OK again

To set the font in Firefox:

·from the Tools pull-down menu select Options

·in the Content tab under Fonts & Colors choose Arial Unicode MS as the Default font

·click on Advanced…

·select Western in the Fonts for menu

·select Arial Unicode MS from the Sans-serif menu

You must also ensure that the character encoding for the page you are looking at is set to Unicode (UTF-8). To check this:

·In Internet Explorer, from the View pull-down menu select Encoding and check that the black dot is next to Unicode (UTF-8)

·In Firefox from the View pull-down menu select Character Encoding and check that the black dot is next to Unicode (UTF-8)

The above information is also available in the Newton Help Pages ‘displaying Unicode characters’.
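Why the encoding setting matters can be seen in a short Python sketch. Python is used here purely for illustration; it is not part of Voyager or the browsers, and the names `decode_as`, `correct` and `garbled` are invented for this example. The same bytes, read with the wrong encoding, turn one accented letter into two unrelated characters.

```python
# Illustration only: the same bytes shown under the right and the wrong encoding.
title = "Bibliothèque"

# What a page encoded as Unicode (UTF-8) actually contains: è takes two bytes.
utf8_bytes = title.encode("utf-8")


def decode_as(data: bytes, encoding: str) -> str:
    """Decode bytes the way a browser would if set to the given encoding."""
    return data.decode(encoding)


correct = decode_as(utf8_bytes, "utf-8")     # browser set to Unicode (UTF-8)
garbled = decode_as(utf8_bytes, "latin-1")   # browser set to a Western encoding

print(correct)
print(garbled)
```

If a diacritic appears on screen as a pair of odd characters like this, the encoding setting is the first thing to check.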

Exercise 1

Check that the font preferences in Voyager Cataloguing, Internet Explorer and Firefox are all set correctly on your PC.

2. Taking the ‘dire’ out of diacritic?

Inclusion in catalogue entries of characters outside the standard alphanumeric set (letters A-Z in upper- and lower-case, numerals, and common marks of punctuation) requires special procedures. Such characters include both diacritics (marks attached to letters to indicate modified sound or value) and special characters (pound sign, dagger etc.).

How do I enter diacritics and special characters in Voyager Cataloguing?

2.1 Special Character Entry

To add a diacritic, place the cursor immediately after the character that requires it. From the Edit menu select Special Character Entry (or use the shortcut Ctrl+E). A window will open listing, in alphabetical order of name, all the diacritic characters that Voyager will allow to be entered. Click on the diacritic to be entered, then click either Insert, to insert the character and keep the window open, or Insert/Close, to insert the character and return to the record. To add a special character follow exactly the same steps, but place the cursor where you would like the special character to appear.

Please note that the entries in the list of available diacritics appear in alphabetical order of the names agreed locally as the most appropriate way of describing them, chosen to give a sensible and useful order. These are not the official MARC 21 names used for those characters in the MARC documentation.

Bug. Sometimes the ‘Special Character Entry’ option in the Edit menu appears greyed out and cannot be accessed. If this happens, try clicking in the main record window and then going back to the Edit menu. If that doesn’t work, try closing and re-opening the record. This is an intermittent bug and unfortunately has no pattern to it.

2.2 Special Character Mode

From the Edit menu select Special Character Mode or use the shortcut Ctrl+D. This changes your keyboard layout. The keys on the keyboard now produce diacritic characters rather than the normal characters. ‘Special Character’ will appear in the bottom right hand corner of the Voyager window to show that you are in Special Character Mode.

Information about which key now produces which diacritic can be found in two places:

1. in the first column in the special character entry window under key press

2. in the cataloguing documentation section of the Libraries@Cambridge website: ‘list of diacritics/special characters by character’

Example

To write an e acute ‘é’ the key press or input symbol is b.

Place the cursor to the right of the e to which you wish to add the acute. Turn on the Special Character Mode and press b on the keyboard.

Note that you cannot delete characters while in Special Character Mode.

To deactivate Special Character Mode and return the keyboard to normal, select Special Character Mode from the Edit menu again, or use the shortcut Ctrl+D.

Adding multiple diacritics

Multiple diacritics associated with a single character should be handled as follows:

·Enter diacritics from letter outward: 1 Letter 2 Diacritic Nearer 3 Diacritic Farther

·Enter letter with diacritics above and below in this order: 1 Letter 2 Diacritic Below 3 Diacritic Above
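The ‘below before above’ rule matches the order that the Unicode standard itself treats as canonical. The following Python sketch is an illustration only and makes no claim about how Voyager stores or normalises records; the helper names `canonical_order` and `describe` are invented for the example. It enters a letter with an acute accent above and a dot below in the ‘wrong’ order and shows that Unicode normalisation rearranges them into Letter, Diacritic Below, Diacritic Above.

```python
import unicodedata
from typing import List

ACUTE = "\u0301"      # combining acute accent (sits above the letter)
DOT_BELOW = "\u0323"  # combining dot below (sits below the letter)


def canonical_order(text: str) -> str:
    """Return the text in Unicode canonical decomposed form (NFD)."""
    return unicodedata.normalize("NFD", text)


def describe(text: str) -> List[str]:
    """List the Unicode name of each character in the text."""
    return [unicodedata.name(ch) for ch in text]


wrong_order = "a" + ACUTE + DOT_BELOW   # letter, above, below
right_order = "a" + DOT_BELOW + ACUTE   # letter, below, above

for name in describe(canonical_order(wrong_order)):
    print(name)
```

Two diacritics that both sit above (or both below) the letter are not reordered in this way, which is why the ‘from letter outward’ rule has to be followed by hand.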

Composition of the diacritic and character

Diacritics and special characters are always one of two types: "spacing" (usually special characters) and "non-spacing" (usually diacritics). Spacing characters occupy their own space when printed or displayed on a screen and non-spacing characters do not.

Example

The character ñ is composed of an n plus a non-spacing diacritic tilde. There are two characters but they occupy only one space. To delete both characters you will need to press Backspace twice.
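The ‘two characters, one space’ idea can be demonstrated outside Voyager. This is a minimal Python sketch using the standard `unicodedata` module; the function name `is_non_spacing` is an assumption made for the example, and the sketch says nothing about Voyager's internal behaviour.

```python
import unicodedata

decomposed = "n\u0303"  # n followed by the non-spacing (combining) tilde
precomposed = unicodedata.normalize("NFC", decomposed)  # the single character ñ


def is_non_spacing(ch: str) -> bool:
    """True for combining marks, which occupy no space of their own."""
    return unicodedata.category(ch) == "Mn"


print(len(decomposed))             # number of characters stored
print(is_non_spacing("\u0303"))    # the tilde
print(is_non_spacing("\u00a3"))    # the pound sign, a spacing special character
print(decomposed[:-1])             # what is left after one Backspace
```

Both forms look identical on screen, which is why a letter with a diacritic may need two presses of Backspace in one place and only one in another.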

*Steer clear of Alt Codes*

These work in Windows applications but could cause display problems in Voyager.

*Finally. Don’t forget Copy and paste*

Simply copying diacritics from other Voyager records is as good a method as any!

Exercise 2

In Voyager Cataloguing create a new record with a 100 author field and a 245 title field with the following information:

Author: François Mitterrand (diacritic = cedilla)

Title: Mon rêve, ou la Bibliothèque nationale de France (diacritics = circumflex and grave)

Experiment with both Special Character Entry and Special Character Mode to enter the diacritics. Please do NOT save the record to the database, but keep the record open.

3. MARC my words

The following MARC fields are ones to look out for when cataloguing a foreign language item.

3.1. 880

In the current version of Voyager (7.0.2 with Unicode), non-roman text (Japanese, Arabic, Chinese, Korean, Persian, Hebrew, Yiddish, etc.) can be input, edited and displayed in any field in bibliographic, holding, or authority records using a standard keyboard. 880 fields are used in the bibliographic record to display the non-roman text.

MARC21 standard definition of the 880 field

“Fully content-designated representation, in a different script, of another field in the same record. Field 880 is linked to the associated regular field by subfield $6 (linkage). A subfield $6 in the associated field also links that field to the 880 field. The data in field 880 may be in more than one script.”

In other words, an 880 field contains non-roman script and is linked via a ‡6 to the corresponding primary MARC field, which contains the roman transliteration of that script. The two fields form a kind of ‘couplet’ within a record. IMPORTANT: The ‡6 subfields in the primary field and the 880 must be coded correctly and link up for the record to display correctly in the OPAC.

Example:

The following record represents an item that has its author, title and publication data in Arabic.

The primary MARC fields (100, 245, 260) contain the Roman transliteration of the Arabic.

The actual Arabic script is entered into the 880 fields.

The three 880 fields and the three primary MARC fields form couplets thus:

The 100 field /1st 880 field couplet

The 100 field contains the Roman transliteration of the Arabic author name. It starts with a ‡6 subfield linking it to the first 880 field in the record:

‡6 880-01.

‘880-01’ meaning the first 880 field in this record.

The first 880 field links back to the 100 field with a similar ‡6:

‡6 100-01

‘100-01’ = 100 meaning linked to the 100 field, ‘01’ referring again to the fact that it is the first 880 field in this record.

The 245 field /2nd 880 field couplet

The 245 field contains the Roman transliteration of the Arabic title. It starts with a ‡6 subfield linking it to the second 880 field in the record:

‡6 880-02.

‘880-02’ meaning the second 880 field in this record.

The second 880 field links back to the 245 field with a similar ‡6:

‡6 245-02

‘245-02’ = 245 meaning linked to the 245 field, ‘02’ referring again to the fact that it is the second 880 field in this record.

The 260 field /3rd 880 field couplet

The 260 field contains the Roman transliteration of the Arabic publishing details. It starts with a ‡6 subfield linking it to the third 880 field in the record:

‡6 880-03.

‘880-03’ meaning the third 880 field in this record.

The third 880 field links back to the 260 field with a similar ‡6:

‡6 260-03

‘260-03’ = 260 meaning linked to the 260 field, ‘03’ referring again to the fact that it is the third 880 field in this record.
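The couplet rule lends itself to a mechanical check. The following is a minimal Python sketch and not a feature of Voyager: the function name `unmatched_links` and the representation of a record as a list of (tag, ‡6 content) pairs are assumptions made for illustration. It reports every field whose ‡6 has no matching partner.

```python
from typing import List, Tuple

# Hypothetical representation: (tag, content of ‡6), e.g. ("100", "880-01").
Field = Tuple[str, str]


def unmatched_links(fields: List[Field]) -> List[Field]:
    """Return the fields whose ‡6 has no matching partner in the record."""
    # Ignore anything after the first "/" (script and orientation codes),
    # so that "100-01/(3/r" is compared as "100-01".
    present = {(tag, link.split("/")[0]) for tag, link in fields}
    problems = []
    for tag, link in fields:
        linked_tag, occurrence = link.split("/")[0].split("-")
        # A 100 field with ‡6 880-01 needs an 880 field with ‡6 100-01,
        # and the other way round.
        partner = (linked_tag, f"{tag}-{occurrence}")
        if partner not in present:
            problems.append((tag, link))
    return problems


good_record = [
    ("100", "880-01"), ("245", "880-02"), ("260", "880-03"),
    ("880", "100-01"), ("880", "245-02"), ("880", "260-03"),
]
# Same record, but the last 880 has been given the wrong occurrence number.
bad_record = good_record[:-1] + [("880", "260-04")]

print(unmatched_links(good_record))
print(unmatched_links(bad_record))
```

A miscoded occurrence number is reported twice, once from each side of the broken couplet, which makes it easy to see which pair needs correcting.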

Additional Codes in the 880 ‡6

Script Identification Code

Records downloaded from other databases may contain codes such as “(3”, “(N”, “(2” or “$1” in the 880 ‡6 after the linking information. This is the Script Identification Code and was required in pre-Unicode records to identify the script in the 880 field. It is NOT required in Unicode records. If it is present in a downloaded record do not delete it, but you do not need to supply it for records you create yourself.

Orientation Code

If the script in the 880 field is one that is written right to left, like Hebrew or Arabic, the ‡6 subfield must also contain “/r”. For example, if the 880 field contained text in the Arabic script, the ‡6 might look as follows:

‡6 260-03/(3/r

This is known as the Orientation Code and must be supplied for such scripts for them to index correctly. You do NOT need to put the orientation code into the linking primary field (100, 245, 260, etc.), as the information in that field is romanised so is written left to right.
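The parts of a ‡6 can be pulled apart as follows. This is a hedged Python sketch: the `Linkage` type and the `parse_linkage` function are invented names, and the sketch assumes only the layout described above (linking information first, then an optional Script Identification Code and an optional “/r”, each introduced by a slash).

```python
import re
from typing import NamedTuple, Optional


class Linkage(NamedTuple):
    linked_tag: str              # e.g. "260"
    occurrence: str              # e.g. "03"
    script_code: Optional[str]   # e.g. "(3"; not required in Unicode records
    right_to_left: bool          # True when the Orientation Code /r is present


def parse_linkage(value: str) -> Linkage:
    """Split the content of a ‡6 subfield into its component parts."""
    link, *extras = value.strip().split("/")
    match = re.fullmatch(r"(\d{3})-(\d{2})", link)
    if match is None:
        raise ValueError(f"Not a valid linkage: {value!r}")
    script_code = None
    right_to_left = False
    for extra in extras:
        if extra == "r":
            right_to_left = True
        else:
            script_code = extra
    return Linkage(match.group(1), match.group(2), script_code, right_to_left)


print(parse_linkage("260-03/(3/r"))
print(parse_linkage("880-03"))
```

The second call shows the primary field's ‡6, which carries neither code because the romanised text is written left to right.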

First and Second Indicators in the 880 field

Use the appropriate indicators as available in the associated field. Indicators in field 880 have the same meaning and values as the indicators in the associated field.

Subfield Codes

‡6 linkage (Not repeatable)

‡a-z same as associated field

‡0-5, 7-9 same as associated field

3.2 066

Existing MARC regulations state that if a record contains any characters in a character set other than the default MARC Latin sets, then the record must have an 066 field with the Script Identification Code for the script in a ‡c subfield. The Script Identification Codes are as follows:

Script / Code
Arabic / (3
Chinese, Japanese, Korean / $1
Cyrillic / (N
Hebrew / (2
Greek / (S

For example if the record contained text in the Cyrillic script, the 066 should be coded as follows:

066 ## ‡c(N
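As a rough illustration of how the codes in the table above relate to the text in a record, the Python sketch below guesses the script of each character from the first word of its Unicode name and assembles an 066 field. The helper names `codes_for_066` and `build_066` are hypothetical, the name-based detection is a simplification chosen for brevity, and the sketch covers only the scripts listed in the table.

```python
import unicodedata
from typing import List, Optional

# Script Identification Codes from the table above, keyed by the first word
# of the Unicode character names used for that script.
SCRIPT_CODES = {
    "ARABIC": "(3",
    "CJK": "$1",
    "HIRAGANA": "$1",
    "KATAKANA": "$1",
    "HANGUL": "$1",
    "CYRILLIC": "(N",
    "HEBREW": "(2",
    "GREEK": "(S",
}


def codes_for_066(text: str) -> List[str]:
    """Return the script codes needed for the text, in order of first use."""
    found: List[str] = []
    for ch in text:
        first_word = unicodedata.name(ch, "").split(" ")[0]
        code = SCRIPT_CODES.get(first_word)
        if code and code not in found:
            found.append(code)
    return found


def build_066(text: str) -> Optional[str]:
    """Build an 066 field for the text, or None if no listed script occurs."""
    codes = codes_for_066(text)
    if not codes:
        return None
    return "066 ## " + "".join("‡c" + code for code in codes)


print(build_066("Война и мир"))
print(build_066("War and peace"))
```

A record containing more than one of these scripts would, under this sketch, simply receive one ‡c for each script found.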

3.3 041 Language Code (R)

Codes for languages associated with an item when the language code in field 008/35-37 of the record is insufficient to convey full information. Includes records for multilingual items and items that involve translation.

Sources of the codes are: MARC Code List for Languages

Indicators

First - Translation indication

# - No information provided

0 - Item not a translation/does not include a translation

1 - Item is or includes a translation

Second - Source of code

# - MARC language code

7 - Source specified in subfield $2

Most useful Subfield Codes

$a - Language code of text/sound track or separate title (R)

Language code in the first occurrence of subfield $a is also recorded in 008/35-37 (Language) unless 008/35-37 contains blanks (###) or the code "zxx" (No linguistic content).

$b - Language code of summary or abstract (R)
$h - Language code of original and/or intermediate translations of text (R)

Language code(s) for intermediate translations; codes precede those for original languages.

Examples

041 ## $aeng$afre$aswe
A multilingual item in English, French and Swedish.

041 1# $aeng$hrus
An item in English translated from the original Russian OR an item in English that includes Russian translation somewhere in the text.

041 1# $aeng$hger$hswe
An item in English that has been translated from German that has been translated from the original language of Swedish OR an item in English that has German and Swedish translations somewhere in the text!

041 0# $aeng$bfre$bger$bspa
An item in English with summaries and/or abstracts in French, German and Spanish.
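The examples can be read programmatically in much the same way as by eye. The Python sketch below is illustrative only: `parse_041` and `consistent_with_008` are invented names, the field is passed in as plain text rather than read from Voyager, and only three-letter language codes are handled (so a source code in subfield $2 is ignored).

```python
import re
from typing import Dict


def parse_041(indicators: str, subfields: str) -> Dict[str, object]:
    """Split an 041 field into its translation flag and language codes."""
    # First indicator: # = no information, 0 = not a translation, 1 = translation.
    result: Dict[str, object] = {
        "translation": {"0": False, "1": True}.get(indicators[0])
    }
    pattern = r"\$([a-z0-9])([a-z]{3})"
    for code, language in re.findall(pattern, subfields.replace(" ", "")):
        result.setdefault(code, []).append(language)
    return result


def consistent_with_008(parsed: Dict[str, object], lang_008: str) -> bool:
    """Check the first $a against 008/35-37, allowing blanks and 'zxx'."""
    if lang_008.strip("# ") == "" or lang_008 == "zxx":
        return True
    first_a = parsed.get("a", [None])[0]
    return first_a == lang_008


field = parse_041("1#", "$aeng$hger$hswe")
print(field)
print(consistent_with_008(field, "eng"))
```

The second function reflects the note under $a: the first language code should agree with 008/35-37 unless that position is blank or coded "zxx".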

*Don’t get confused with 040*

Cataloguing Source (NR)

040 $b = language of cataloguing

The MARC code for the language of cataloguing in the record should be entered here, i.e. eng for English, not the language of the item being catalogued.

Useful resources

·MARC standards Appendix D – Multi-script records. Contains full record examples.

·MARC standards – 880 Alternate Graphic Representation (R)

·MARC standards – 041 and 040

·Libraries@Cambridge website /documentation/Cataloguing/Cataloguing using non-roman scripts

Exercise 3

Go back to the record you created in Exercise 2. Add an 041 field. Note that Mitterrand's book contains a summary in English but no translations.

4. Red sauce or brown sauce?

Sourcing reliable records for foreign language items can sometimes be difficult. There is unfortunately no rule of thumb.

Foreign language cataloguers at the UL go to the usual sources first: OCLC, LC, RLUK. If these agencies don't have a good record then they try the national library for the relevant country. These national library records can give some help with subject headings even if they do not use AACR2 or MARC21. It is also a good idea to look for a translation in English of the item you are cataloguing. This sometimes comes up trumps.

Some recommended resources:

Greek books in print
- but this does require a knowledge of Greek and of course doesn't supply a bibliographic record.

NACSIS-CAT is the largest source of cataloguing records for Japanese monographs and serials, and is sometimes useful for cataloguing. You can see the cataloguing records of NACSIS-CAT on the following sites:



Sources for Chinese records (note that these are not in MARC 21):



5. Foreign keyboards / adding languages to the PC

If you want to type large amounts of text in a foreign script it may be easier to add foreign language keyboard options to your PC. Foreign language keyboards can also be used to enter diacritics in a language. Please note that you must have administrator rights on your machine to do this.

5.1 How to add languages on Windows XP

1.From the Start menu select Settings then Control Panel

2.Double click on Regional and Language Options

3.On the window that opens click on the Languages tab

4.Click on the Details button

5.The ‘Text Services and Input Languages’ window will open. The ‘Installed services’ box will list the languages and accompanying keyboards that are already installed on the machine.

6.To add a language click on the Add button

7.A new window will open

8.Click on the down arrow to the right of the “Input language” box. A drop-down list of all the languages it is possible to install will appear. Select the language required then click OK.

9.If the language does not appear on the list, it will need to be installed from the Windows XP CD – please consult your Computing Officer for details on how to do this.

10.For some languages there are several possible keyboard layouts/IMEs (Input Method Editors). In nearly all cases the one the system initially offers as default is the one that is generally used. If at a later date this proves unsuitable, you will need to add the language again but select a different keyboard layout from the list.

11.To add another language click on the Add button and repeat the process.

12.When all the required languages have been added, click Apply.

13.In the “Text services and Input Languages” window click on the Language bar button. Ensure that the ‘Show the language bar on the desktop’ and ‘Show additional language bar icons in the taskbar’ boxes are ticked.

14.The language bar sits on the taskbar in the bottom right-hand corner of the screen and allows you to move quickly and easily between languages, and also to see which language you are currently using.

5.2 Changing the language once added

In order to enter non-English or non-Roman characters, the keyboard on the PC must be changed from one that produces English characters when the keys are pressed to one that produces characters for the specified language or script when keys are pressed.

The language setting is specific to the program that is open when the language is changed. Therefore the first thing you need to do is click, in the Voyager client or OPAC window, where you want the non-English characters to appear.

1.Click on the two-letter keyboard language indicator on the taskbar.

2.A list of the languages that have been installed on the PC will be displayed.

3.Click on the language you want to switch the keyboard to. The keyboard language indicator will change to the code for the language you have selected.

4.The keyboard has now been changed from an English keyboard to one as it would appear for the language that has been selected, and users can input characters from that language as search criteria using the keyboard.

5.If the language chosen is one that is written right to left (such as Arabic and Hebrew), characters will appear on the screen in this manner (the first character typed in a string will be the first character from the right, and the last character typed will be first from the left).
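Although right-to-left text is displayed from the right, it is stored in the order in which the keys were pressed. The short Python sketch below illustrates this with the standard `unicodedata` module; the function name `is_right_to_left` and the sample word are assumptions chosen for the example, and nothing here is specific to Voyager.

```python
import unicodedata


def is_right_to_left(ch: str) -> bool:
    """True for characters Unicode classes as right-to-left (Hebrew, Arabic)."""
    return unicodedata.bidirectional(ch) in ("R", "AL")


# The Hebrew word "shalom", in the order the keys are pressed:
# shin, lamed, vav, final mem.
typed = "\u05e9\u05dc\u05d5\u05dd"

# The first character stored is the first one typed (shin), even though
# it is the one displayed furthest to the right.
print(unicodedata.name(typed[0]))
print(all(is_right_to_left(ch) for ch in typed))
print(is_right_to_left("a"))
```

Only the display direction changes with the script; typing, deleting and copying all work on the stored order.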

5.3 Using the on-screen keyboard

When the language on a PC is changed, the keyboard changes from an English keyboard to one as it would appear for the language that has been selected, and users can input characters from that language. As a result, the characters that display when the keys are pressed will no longer correspond with the symbols on the keys on the keyboard. The on-screen keyboard displays the keyboard for the language selected on the screen and allows the user to use it as a map to see what keys they need to press to produce the required characters, or to click on the key on the on-screen keyboard to input that character. This should already be installed on the PC, but if not, consult your Computing Officer.