NESCent Workshop on Building a Digital Data Repository for Evolutionary Biology
December 5, 2006
Metadata Concepts and Schemes
DEFINITIONS from the Dublin Core Metadata Initiative (DCMI) Glossary:
1. Extensible Markup Language (XML)
A subset of Standard Generalized Markup Language (SGML), a widely used international text processing standard. XML is being designed to bring the power and flexibility of generic SGML to the Web, while maintaining interoperability with full SGML and HTML.
2. Metadata
"Data about data;" functionally, "structured data about data." Metadata includes data associated with either an information system or an information object for purposes of description, administration, legal requirements, technical functionality, use and usage, and preservation. In the case of Dublin Core, information that expresses the intellectual content, intellectual property and/or instantiation characteristics of an information resource.
3. Ontology
A hierarchical structure that formally defines the semantic relationship of a set of concepts.
4. Resource
A resource is anything that has identity. Familiar examples include an electronic document, an image, a service (e.g., "today's weather report for Los Angeles"), and a collection of other resources. Not all resources are network "retrievable"; e.g., human beings, corporations, and bound books in a library can also be considered resources.
5. Taxonomy
Systematic classification according to principles or general laws. In digital terms, automated classification of documents in a hierarchy based on information gathered by a metacrawler. May refer to a classification of DCMI terms. A classification system such as Library of Congress Classification is an example of a taxonomy.
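To make the first two definitions concrete, the following is a minimal sketch of metadata expressed in XML, using Python's standard library. It assumes the Dublin Core element namespace (http://purl.org/dc/elements/1.1/); the wrapper element name "record", the helper build_record, and all field values are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_record(fields):
    """Build an XML element holding one Dublin Core element per (name, value) pair.

    The wrapper element name "record" is an assumption for this sketch,
    not part of Dublin Core itself.
    """
    record = ET.Element("record")
    for name, value in fields:
        element = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        element.text = value
    return record


# Hypothetical description of a data set (the "resource" being described).
record = build_record([
    ("title", "Example data set (hypothetical)"),
    ("creator", "Doe, Jane"),
    ("date", "2006-12-05"),
    ("type", "Dataset"),
])

# Serialize the record to an XML string.
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Each dc: element is a piece of structured data about the resource, which is the sense of "metadata" given above; XML supplies only the syntax that carries it.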
Metadata Schemes
1. A-Core (Administrative Core):
“Metadata about metadata - referred to as the A-Core - is useful to designate information about the provenance, management or administration of other sets of descriptive metadata.”
2. DCMES (Dublin Core Metadata Element Set):
“The Dublin Core metadata element set is a standard for cross-domain information resource description.”
3. DDI (Data Documentation Initiative):
“The Data Documentation Initiative is an international effort to establish a standard for technical documentation describing social science data… a rich and complex standard that can describe the data at the element level as well as the project level.”
4. EAD (Encoded Archival Description):
“The EAD Document Type Definition (DTD) is a standard for encoding archival finding aids using Extensible Markup Language (XML).” This schema is capable of describing a collection, as well as individual items.
5. EML (Ecological Metadata Language):
“Ecological Metadata Language (EML) is a metadata specification developed by the ecology discipline and for the ecology discipline. It is based on prior work done by the Ecological Society of America and associated efforts (Michener et al., 1997, Ecological Applications).”
6. FGDC/CSDGM (Federal Geographic Data Committee/Content Standard for Digital Geospatial Metadata):
“… a common set of terminology and definitions for the documentation of digital geospatial data. The standard establishes the names of data elements and compound elements (groups of data elements) to be used for these purposes, the definitions of these compound elements and data elements, and information about the values that are to be provided for the data elements.”
7. LSID (Life Science Identifier): ftp://ftp.omg.org/pub/docs/dtc/04-10-08.pdf
The LSID is a specification that addresses standardized naming schemas for entities in the life sciences.
8. MARC (MAchine Readable Cataloging) Bibliographic Format:
A communication standard for the machine-readable representation of bibliographic data.
9. NBII (National Biological Information Infrastructure):
“…a broad, collaborative program to provide increased access to data and information on the nation's biological resources. The NBII links diverse, high-quality biological databases, information products, and analytical tools maintained by NBII partners and other contributors in government agencies, academic institutions, non-government organizations, and private industry.”
10. ODRL (Open Digital Rights Language):
“The ODRL/DCMI metadata usage Profile will document how to make combined use of the rights-related DCMI metadata terms and the ODRL rights expression language.”
11. PREMIS (Preservation Metadata Implementation Strategies):
“…a core preservation metadata set, supported by a data dictionary, with broad applicability across the digital preservation community. Identify and evaluate alternative strategies for encoding, storing, and managing preservation metadata in digital preservation systems.”
12. TEI (Text Encoding Initiative) Header:
Standard used to describe text and the formats within a text to assist with digital processing.
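As an illustration of the naming scheme in item 7, the following is a minimal sketch of splitting an LSID into its parts. It assumes the URN form urn:lsid:authority:namespace:object with an optional trailing revision; the function name parse_lsid, the LSID tuple type, and the example identifier are hypothetical, and the sketch does not check which characters are allowed in each part.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """Components of a Life Science Identifier."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split an LSID of the assumed form
    urn:lsid:<authority>:<namespace>:<object>[:<revision>]
    into its components, raising ValueError if the shape is wrong.
    """
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 colon-separated parts: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID (must start with 'urn:lsid:'): {text!r}")
    if any(not part for part in parts[2:5]):
        raise ValueError(f"authority, namespace and object are required: {text!r}")
    # An empty or missing sixth part means no revision was given.
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return LSID(parts[2], parts[3], parts[4], revision)


# Hypothetical identifier, used only to show the structure.
example = parse_lsid("urn:lsid:example.org:specimens:12345:2")
print(example)
```

The authority names the organization that issues the identifier, the namespace and object locate the entity within that authority, and the revision, when present, distinguishes versions of the same object.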