Audit report on JUDAICA content including metadata

ECP-2008-DILI-538025

JUDAICA Europeana

Audit report on JUDAICA content
including metadata

Deliverable number / D2.1
Dissemination level / Public
Delivery date / 30 April 2010
Status / 1st Version
Author(s) / Rachel Heuberger UB-FFM, Louise Asher JML, Zanet Batinou JMG, Giuliana De Francesco MIBAC, Jean-Claude Kuperminc AIU and Gilles Rozier Medem-MCY, Dov Winer, Lena Stanley-Clamp EAJC, Pier Giacomo Sola AMITIE, Zsuzsanna Toronyi MZSML

eContentplus

This project is funded under the eContentplus programme[1],
a multiannual Community programme to make digital content in Europe more accessible, usable and exploitable.


Table of Contents

1. Summary

1.1 The Purpose of Work Package 2

1.2 Overview of the deliverable

2. Content Identification and Selection

2.1 How the survey was carried out

2.2 The survey and guidelines

2.3 Collection content to be digitized surveyed

2.4 Collection content already digitized surveyed

3. Metadata Survey and Alignment

3.1 The Survey and Guidelines

3.2 Results of Metadata Survey

3.3 Use of Vocabulary for Subject Description

3.4 Export Formats

4. Standards of Metadata Schema

4.1 Describing Standards

4.2 Information schemes (metadata)

5. Conclusions

5.1 Content to be digitized

5.2 Metadata Survey

5.3 Use of Controlled Vocabulary

5.4 Future work of WP 2

1. Summary

1.1 The Purpose of Work Package 2

Work Package 2 of the Judaica Europeana project (WP2) is tasked with:

1.  Content identification and selection: auditing, assessing and selecting content to be digitised in the partner institutions' collections, and auditing in detail the digitised resources already available. This includes establishing an advisory group of thematic domain experts to support content selection according to set criteria;

2.  Surveying the existing metadata schema used currently by the partners and facilitating the mapping of those standards to a common metadata standard;

3.  Assessing the requirements for the adaptation of controlled vocabularies for Judaica purposes;

4.  Producing tools to support the conversion of the partners’ data into the common harvesting format for ingestion into the main Europeana service;

5.  Establishing a pilot knowledge management system to support the community of practice of scholars and cultural heritage professionals in the thematic domain area.

WP2 works together with the other work packages in the project, and particularly closely with WP3 and WP4, to which it supplies information about standards. The survey underlying this deliverable was extended to collect information on IPR issues for use within WP4.

1.2 Overview of the deliverable

This deliverable is the first outcome of this work and is the result of content identification based on a survey of the content that partners contracted to provide to Europeana through the Judaica Europeana project.

These collections are described, in outline, in the Description of Work for the project (pp. 10-33).

The first part presents the results of the Judaica Europeana Survey, which the partners carried out to give an accurate picture of the current state of Judaica content. The survey shows in detail the resources to be digitized and the material already digitized. It also gives detailed information about the metadata schema used in the partners’ institutions.

The second part presents an overview of existing metadata schema, which provides background for the rest of the deliverable. The partners’ institutions are a mixture of archives, libraries and museums with Judaica content, and they use a correspondingly wide range of standards. Presenting the key standards used by each type of institution conveys this multiplicity.

2. Content Identification and Selection

2.1 How the survey was carried out

This chapter gives detailed descriptions of the collection content surveyed for this deliverable. The information was obtained from the answers to the survey that was sent to the content providers of Judaica Europeana.

The survey questions follow the list of elements defined in the Metadata Mapping & Normalisation Guidelines for the Europeana Prototype, Version 1.2.1, 18/01/2010, Europeana v1.0, chapter 2.1. [2]

The content providers were asked to give detailed information about the collections to be digitized as well as about those already digitized. The questionnaire, with its explanations for the content providers, and the results of the survey are reproduced below.

2.2 The survey and guidelines

The content providers were asked to answer the following questions:

1) Which collections will be digitized in this project?

Fill in Chart I – Content to be digitized

Chart I: Content to be digitized

Category / Description / Fill In
Provider / Name of institution
Type of Object in detail / Text (printed or manuscript; in a book, periodical or as a single page), or Image such as photos and postcards, or Artifact such as textiles, coins, instruments, or sound records, videos, films ….
Quantity / State clearly if you mean books, pages, film clips etc. Use a separate line for each type.
1, 2, 3, / Example: 1) 1000 postcards; 2) 45 pages of letters …
Format and Quality / In relevant cases such as images, films or sound records: JPEG, MPEG, HTML etc. Quality: resolution, sampling rate, colour/grey scale etc.
IPR / Only Public domain


2) Which collections are already digitized?

Fill in Chart II – Content already digitized

Chart II : Content already digitized

Category / Description / Fill In
Provider / Name of institution
Type of Object in detail / Text (printed or manuscript; in a book, periodical or as a single page), or Image such as photos and postcards, or Artifact such as textiles, coins, instruments, or sound records, videos, films ….
Quantity / State clearly if you mean books, pages, film clips etc. Use a separate line for each type.
1, 2, 3, / Example: 1) 1000 postcards; 2) 45 pages of letters …
Format and Quality / In relevant cases such as images, films or sound records: JPEG, MPEG, HTML etc. Quality: resolution, sampling rate, colour/grey scale etc.
IPR / Only Public domain

2.3 Collection content to be digitized surveyed

Audit, assessment and selection by the partners of content to be digitized

Provider / Type of object / Quantity / Format and Quality
UB FFM / Books / 1000 title pages, 100.000 pages / Paper, black/white
Illustrations / 100 pages / Paper, black/white and coloured
Alliance Israelite Universelle / Archives / 300000 pages / paper
Newspaper / 270000 pages / paper
Photographs / 10000 photos / paper
Books / 200000 pages / Paper
Manuscripts / 1000 pages / parchment
Medem / Photographs / 1000 photos / Paper
Newspapers / 70000 pages / Paper
Music / 3500 tracks / Cassettes, vinyls
Printed music scores / 5000 pages / paper
Newspaper / 270000 pages / paper
Jewish Museum of Greece / Photographs / 10000 photos / paper
Books / 200000 pages / Paper
Manuscripts / 1000 pages / parchment
Religious Artifacts / 596
Textiles / 403
Costumes / 478
Domestic Artifacts / 477
Personal Objects / 400
Ephemera / 99
Etchings / 94
Photographs / 2673 / black/white
Coins/medals / 125
Architectural Elements / 27
Contemporary Artworks / 47
Documents:
Arditis Archive / 230 pages
Patras Archive / 2740 pages
Molho Archive / 110 pages
Florentin Archive / 60 pages
Yoel Archive / 270 pages, 45 photos / black/white
Hidden Children in occupied Greece – / 340 images / JPEG
Jewish Neighbourhoods of Greece – / 150 images / JPEG
WW II and the Holocaust in Greece – / 376 images / JPEG
Holocaust Survivors’ Personal Testimonies / 55 images / JPEG
Young People in the Maelstrom of occupied Greece / 420 images / JPEG
Municipal Museum of Ioannina – Judaica Collection / 270 items
Hungarian Jewish Archives / Books / 4.500 pages
Manuscripts / 1761 pages
Postcards / 882 postcards / two sided = 1764 images
Ministero per i Beni e le Attivita Culturali
1) Bibliotheca Palatina di Parma / Manuscripts / 80 manuscripts, ? pages / Paper/parchment
2) Archivio di Stato di Venezia / Manuscripts and printed texts / 42.000 pages
Jewish Historical Institute, Warsaw / Manuscripts and texts / 320.000 pages / Paper
photos / 100 / Black/White
The Jewish Museum London / Audio recordings / 150 separate recordings / MPEG audio files and PDF transcripts
Images (photographs, illustrations, paintings) / 300 separate images
Printed Text (leaflets, books, letters, document) / 300 pages
Objects / 30 individual objects

2.4 Collection content already digitized surveyed

Provider / Type of object / Quantity / Format and Quality
UB FFM / Text (Books and periodicals) / 12.735 title pages = images; 1.500.000 pages = images / TIFF/JPEG 600 dpi greyscale; 5% TIFF/JPEG colour 300 dpi
AIU / Archives / 10000 views / JPG
Newspapers / 200000 views / JPG
Photographs / 5000 views / JPG/TIFF
Manuscripts / 500 views / JPG
Medem / Music / 3500 tracks / MP3
Jewish Museum of Greece / No digital sources
Hungarian Jewish Archives / Text, typewritten / 3660 testimonies + 68 pages = 3.800 pages
Ministero per i Beni e le Attivita Culturali
1) Bibliotheca Palatina di Parma / Manuscripts / 80 ms / JPEG 300 dpi
2) Archivio di Stato di Venezia / No digital sources
Jewish Historical Institute Warsaw / No digital sources
The Jewish Museum London / 1) Images (photographs, illustrations, paintings); 2) Text (leaflets, books, letters, documents) / 1200 separate images / JPEG 150 dpi RGB colour/Grayscale
Documents / 1200 pages / JPEG 150 dpi Grayscale

3. Metadata Survey and Alignment

3.1 The Survey and Guidelines

The content providers were asked to answer the following questions:

3) What Metadata Scheme is currently applied in your institution?

Does the Metadata Scheme fulfil the minimum set of Metadata Elements?

Fill in Chart III – Minimum Set of Metadata

Chart III – Minimum Set of Metadata:

Category / Description
Provider / Name of Institution
Standard Metadata / Do Standard Metadata (like Dublin Core, CDWA, MARC) exist?
If you do not use Standard Metadata, please specify whether the mandatory categories are filled out.
Object Identification - Title / Basic information about the object:
Library : Title of book
Museum: Name of object
Archive: Title of item or file
Alternative Title (if applicable) / Example: original title of translated book: Le vent des Khazars <dt.> (Der Messias Code) by M. Halter
Creator / Author or Editor or Association, community
Contributor / Could also be producer/ manufacturer
Date / Publication or Production Date
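
The minimum-set question in Chart III can be illustrated with a short check. The following Python sketch is illustrative only: the field names are hypothetical labels modelled on the Chart III categories, and the default choice of which elements count as required is an assumption made for this example, not a rule taken from the Europeana guidelines.

```python
# Hypothetical field names modelled on the Chart III categories.
CHART_III_ELEMENTS = ("title", "alternative_title", "creator", "contributor", "date")


def missing_elements(record, required=("title", "creator", "date")):
    """Return the required elements that are absent or empty in a record.

    `required` is an assumption for this sketch; adjust it to whatever
    minimum set the aggregator actually mandates.
    """
    return [name for name in required if not str(record.get(name) or "").strip()]


# Example record using the book cited in Chart III; the date is left empty.
record = {"title": "Le vent des Khazars", "creator": "M. Halter", "date": ""}
print(missing_elements(record))  # ['date']
```

A record passes the check when the returned list is empty.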

Additional Europeana Elements

4) Do you provide additional Metadata?

Please specify them in the following chart.

Chart IV Expanded elements - recommended

Coverage / Geographical or temporal coverage, similar to subject, e.g. photo of Rome
Description / Text: number of pages of book, number of illustrations, size, etc. Object: material, colours
Is Part of / Relation, e.g. a letter that is part of a collection
Language / Language of the text of the digital object
Publisher
Source / Content holder such as British Library, Louvre... The institution that aggregates (i.e. submits) the data to Europeana is recorded under provider. If source and provider are identical, then use provider.
Subject / Topic, people, places
Type / Information about type of object (carving, pressing) or material such as wood
Record ID / Signature (shelf mark) of the book or files; inventory number of objects

Additional Elements

Format / File Format of digitised object like text
Extent / = refinement of format, e.g. size of original object
Medium / Material of original object, similar to type
Provenance / History of ownership or custody

5) Which Vocabularies, if any, do you use for Subject description, such as LoC Subject Headings or Name Authority Files?

6) Does your Metadata System have the capability to export records as XML files?
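
To make question 6 concrete, the following Python sketch shows one way a flat record could be serialised to XML using Dublin Core element names. It is a minimal illustration under stated assumptions: the wrapper element `record` and the helper `record_to_xml` are hypothetical, and the output is not the Europeana harvesting format, which is defined in the guidelines cited above.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def record_to_xml(record):
    """Serialise a flat dict of Dublin Core element names to an XML string."""
    root = ET.Element("record")  # wrapper element name is an assumption
    for name, value in record.items():
        if value:  # skip empty fields
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Illustrative values taken from the example in Chart III.
xml_text = record_to_xml({"title": "Le vent des Khazars", "creator": "M. Halter", "date": ""})
print(xml_text)
```

Empty fields are skipped here, so the example record produces `dc:title` and `dc:creator` elements only.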

3.2 Results of Metadata Survey

The following chart shows the metadata that exist at the partners’ institutions for the physical objects (books, archival material, museum artifacts) in the collections to be digitized. The categories are based on the list of elements defined in the Metadata Mapping & Normalisation Guidelines for the Europeana Prototype, Version 1.2.1, 18/01/2010, Europeana v1.0, chapter 2.1, as cited in section 2.1 of this report.[3]

The existing metadata will enable each content provider to map its records efficiently to Europeana in line with the standards.

3.2.1  Minimum Set of Metadata

3.2.2  Expanded Elements

3.2.3  Additional Elements

Category / UB-FFM / AIU Medem / JM Greece / MAZSIHISZ Budapest / Parma / Venice / JHI Warsaw / JM London
Standard Metadata / Dublin Core / Dublin Core / No / Dublin Core OAI / Dublin Core / No / No / No
Title / X / X / X / - / X / X / X / X
Altern. Title / X / X / - / - / - / - / - / X
Creator / X / X / X / - / X / X / X / X
Contributor / X / X / X / - / - / - / - / X
Date / X / X / X / - / X / X / X / X
Recommended Elements
Coverage / X / X / X / - / X / X / - / X
Description / X / X / X / - / X / X / - / X
Is part Of / X / X / X / - / X / X / - / X
Language / X / X / X / - / X / X / X / X
Publisher / X / X / X / - / - / - / - / X
Source / X / X / X / - / X / X / X / X
Subject / X / X / X / - / - / X / - / X
Type / X / X / X / - / - / - / - / X
Record Id / X / X / X / - / X / X / X / X
Additional Elements
Format / X / X / X / - / - / - / X / X
Extent / - / - / X / - / - / X / - / -
Medium / - / - / X / - / - / X / - / -
Provenance / X / X / X / - / X / X / X / X
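
The chart above can also be read per element, counting how many of the eight columns report each element. The Python sketch below does this for four rows copied from the chart. The column order follows the chart header, and the helper names `coverage` and `providers_with` are hypothetical; it is an illustration of how the chart can be tallied, not part of the survey itself.

```python
# Column order as in the chart header.
INSTITUTIONS = ["UB-FFM", "AIU Medem", "JM Greece", "MAZSIHISZ Budapest",
                "Parma", "Venice", "JHI Warsaw", "JM London"]

# "X" = element present, "-" = absent; four rows copied from the chart.
ROWS = {
    "Title":       "X X X - X X X X",
    "Contributor": "X X X - - - - X",
    "Subject":     "X X X - - X - X",
    "Extent":      "- - X - - X - -",
}


def coverage(rows):
    """Count, per element, how many columns are marked 'X'."""
    return {element: marks.split().count("X") for element, marks in rows.items()}


def providers_with(element, rows=ROWS):
    """List the columns marked 'X' for one element."""
    marks = rows[element].split()
    return [inst for inst, mark in zip(INSTITUTIONS, marks) if mark == "X"]


print(coverage(ROWS))            # {'Title': 7, 'Contributor': 4, 'Subject': 5, 'Extent': 2}
print(providers_with("Extent"))  # ['JM Greece', 'Venice']
```

Adding the remaining rows to `ROWS` in the same notation extends the tally to the full chart.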

3.3 Use of Vocabulary for Subject Description

One of the tasks of WP2 is to assess the requirements for the adaptation of controlled vocabularies for Judaica purposes. To support this assessment, the partners were asked to report the types of vocabularies in use in their institutions.

The result was the following:

Institution / Type of Vocabulary
UB FFM / German Norm Subject Headings RSWK; Authors authority files according to LoC; Partly DDC; Local Systematic Thesaurus
AIU / Medem / Local Subject Thesaurus; Authors authority files taken from Bibliothèque Nationale de France and Israel National Library
JM Greece / UNESCO Vocabulary
Budapest / ------
MiBAC / Parma / ------
MiBAC/ Venice / ------
Jewish Historical Institute/ Warsaw / ------
Jewish Museum London / In-house system

3.4 Export Formats

UB FFM: Not completely at present