Digital Library of Appalachia Handbook
ACA Central Library
Appalachian College Association
Revised August 2010
Table of Contents
Contributing to the Digital Library of Appalachia
Scope, content, & mission
Selection criteria
Copyright and permissions
Project checklist
Care and handling of materials
How to digitize: Specifications and best practices by format
Digitization terminology
Scanning text
Scanning photographs
Scanning drawings, maps, & other fine-line images
Digitizing three-dimensional objects using a digital camera
Digitizing audio
Specifications table
Saving your master image
Creating access images
Scanning steps
Metadata specifications
Creating file names
Naming files for compound objects
Creating item records using Dublin Core
Description and use of elements
Sample records
Adding your items to CONTENTdm
Ongoing care of your collection
Maintenance
Ideas for promotion of your collection
Assessment of your collection’s usage
Additional readings
Appendix A: Fair Use laws and evaluation tools
U.S. Code definition of Fair Use
American Library Association copyright evaluation tools
When does work pass into the public domain?
Appendix B: Ready reference tip sheets
CONTENTdm required & recommended fields for the DLA
MARC to Dublin Core crosswalk
Contributing to the Digital Library of Appalachia
Scope and content
The Digital Library of Appalachia seeks to provide online access to archival and historical materials related to the culture of the southern and central Appalachian region. The thirty-four member libraries, archives, and museums associated with the Appalachian College Association, known collectively as ACA Central Library, seek to generate interest and encourage continued scholarship for the entire region. Information in the collection exists as reproductions of color or black and white photographs, reformatted typed pages, published books, unpublished manuscripts, personal diaries and correspondence, journal and newspaper articles, musical recordings, oral history recordings and transcripts, and other relevant materials.
Mission of the Digital Library of Appalachia:
- To improve scholarly access to research resources related to Appalachia. Improved access, particularly to primary source material, will strengthen academic offerings in Appalachian Studies. Students, faculty, and researchers will be able to draw upon the Digital Library of Appalachia for authentic information, and thereby gain a greater understanding of the region.
- To virtually bring together research resources which are physically based in numerous geographically remote locations across five states. This unprecedented opportunity for comparison and contrast will foster new learning about Appalachian experience.
- To broaden opportunities for classroom instruction. Faculty will be able to design new or revised courses based on the resources made available through the Digital Library of Appalachia. Likewise, students and teachers in regional K-12 schools may find the Digital Library of Appalachia revitalizes their courses in state and local history and culture.
- To promote participating institutions by showcasing the contributions of their special collections to Appalachian scholarship.
- To help preserve irreplaceable materials by providing quality digital surrogates and thus diminish handling of the originals.
What to digitize: Selection criteria for adding items to your online collection
Putting materials in a publicly accessible digital archive is in fact ‘publishing’ them, and requires the usual quality controls of a good publication. Librarians and archivists at each institution are responsible for using their good judgment and knowledge of their collections in assessing local holdings and choosing items to be digitized.
The DLA Committee has established some general guidelines for selection:
- Materials included in the Digital Library of Appalachia project should be particularly and substantively representative of Appalachian experiences.
- Appalachia, as defined by the Appalachian Regional Commission, is a 200,000 square-mile region that follows the spine of the Appalachian Mountains from southern New York to northern Mississippi. It includes all of West Virginia and parts of twelve other states: Alabama, Georgia, Kentucky, Maryland, Mississippi, New York, North Carolina, Ohio, Pennsylvania, South Carolina, Tennessee, and Virginia.
- Preference is given to materials that reflect the heritage of Appalachian College Association geographic regions, but materials that address other Appalachian areas may be appropriate.
- Preference is given to scarce or unique items in libraries’ special collections that can be defined by one or more of the following categories:
Cultural Landscape
Daily Life and Customs
Education
Literature
Minorities
Music
Natural environment
Politics and government
Religion and beliefs
Visual arts and handicrafts
Work and occupations
- Collections whose use will very likely be increased with digital access should have a higher priority for selection.
- Materials that are known to be already represented in existing digital collections elsewhere outside of ACA schools should not be duplicated in the DLA.
- Materials included may be in any format that the contributing institution can digitize and catalog to recommended specifications.
- Materials may be appropriate for users at any level, including K-12 schools, higher education, and personal enrichment.
- Participating institutions must exercise due diligence with regard to copyright compliance. Staff may refer to guidelines in this handbook for recommendations on fair use, copyright protections, privacy considerations, and ownership rights. Please see Appendix A for information about Fair Use.
Additional questions to consider
In addition to the specific parameters set above, you might find it helpful to ask yourself the following questions. If you can answer most of these definitively and positively in regard to specific materials, then those items are probably worthy of your time and effort spent on digitization.
Why do you think the item is representative of Appalachia? In what context?
Who is the intended audience for the item or collection? Will the material be of interest to the general public or to a specific audience?
Do you think that digitizing the item will help meet a demand for access to that content? If it actually increases demand for the originals, are you prepared to handle those requests?
Will digitization provide better documentation of the item than is currently available?
Do you hold copyright on the item? If not, can you contact the copyright owner for permission to digitize?
Can the item withstand the handling required for digitization?
Will digitizing the item provide content that could be used for educational purposes, either at the K-12 or college level?
Copyright and Permissions
We encourage Fair Use of these materials under current U.S. Copyright law and accompanying guidelines. Collections of the Digital Library of Appalachia (DLA) are made available for non-profit and educational use, such as research, teaching and private study.
For these purposes, end users may reproduce DLA materials (print, download or make copies) without prior permission. Users must obtain written permission from the owning repository or rights holder before using a particular item for other purposes, including publication or other commercial applications. The owner of each item in the Digital Library of Appalachia is identified in the “Holding Library” field of the item record. Requests for permission should be addressed to the specific holding library. Contact information should be provided in the “Rights” field of the item record. We recommend having requests routed to a general library contact email, if possible, from which they can be forwarded to the appropriate person.
Please see Appendix A, “Fair Use,” for guidelines and links to tools that can help you assess your selected items.
For more information, please see the “Image Rights Options – Banding, Branding, and Watermarking” tutorial in the CONTENTdm Tutorials folder on your DLA flash drive.
Project checklist
Your project will run much more smoothly if you follow a consistent routine. Here are the major steps to consider in completing your projects (all of which are addressed in this handbook):
Preparation
- Establish project goals
- Select items or collections
- Make a good faith effort to research any copyright limitations
- Train staff in handling and scanning of materials
- Define your work space
- Organize and prepare the materials for digitization
Digitization
- Digitize items according to recommended specifications
- Save digital items using DLA file naming procedure
- OCR text-based items (or re-key by hand)
- Proofread OCR text
- Create access images from master
- Back up TIFF images by placing a copy on an external hard drive
- Inspect digital items for quality
Metadata
- Upload digital items into CONTENTdm Project Client
- Complete metadata records for each item using DLA guidelines
- Review records, then upload to CONTENTdm Administration
Publication
- CONTENTdm administrator for your collection reviews uploaded items
- Administrator makes any necessary edits and approves items that are ready to be published in the online collection
- Item is published online!
Maintenance & assessment
- CONTENTdm administrator for your collection periodically reviews collection to ensure stability and accuracy of records
- CONTENTdm administrator regularly evaluates collection user statistics:
- How many times the collection is being accessed in a set period of time
- What items are being accessed most frequently over time
- What format of items is being accessed most frequently over time
- What search terms are being used most frequently
Promotion
- Actively promote collection through online resources (links, sample images on sites like Flickr or Wikimedia, social media tools)
- Create flyer or postcard that highlights key features and informs users how to access the collection; distribute among faculty, students, & area cultural/educational organizations
- Consider writing an article for school or local paper, newsletters for school departments and local organizations, journals
Care and Handling Procedures during Digitization for All Collection Materials
You may be fortunate enough to have students or other staff to assist with your project. Item selection and item record creation are best left to the librarian or subject specialist, but others can be trained to help with scanning. Review the scanning specifications, file naming conventions, and step-by-step instructions with your assistants, and then guide them through a few practice scans. Here are some common-sense rules that your assistants should be aware of when handling special collections materials and scanning equipment (adapted from handling procedures written by Yale University Libraries).
The following points provide guidance for handling most collection materials:
Be observant, careful, and use common sense.
No food or drink at work spaces.
Wash hands before handling collection materials, especially after eating, to make sure they are clean at all times.
Do not use collection materials as a writing surface.
Remove paper clips, pins, and string carefully.
Do not use pens, markers, or sharp objects near collection materials.
Keep surfaces clean and uncluttered.
Do not place objects on top of collection materials.
If collection materials are to be stacked, limit stack sizes and heights.
Do not place items on the floor, near windows, or on radiators.
Do not loosen or unbind any book.
Cease the scanning process if materials show signs of stress from handling.
How to digitize: Specifications and best practices by format
Specifications here are informed by the practices of BCR (Bibliographical Center for Research), University of North Carolina, Library of Congress, Research Libraries Group, and Institute of Museum and Library Services (see “Additional readings” for citations).
For text and image materials, specifications are for creating master images in TIFF format, from which CONTENTdm will automatically generate a lower-resolution access image in JPEG format.* It is recommended that master images be unaltered and stored for archival purposes, for the production of print materials, for in-house consultation, and for the creation of derivative images when needed.
* Quality of the automatically generated JPEG images may vary, so you may wish to generate your own JPEGs from the master files for comparison. See “Creating access images” at the end of this section for details.
Terminology
Here are a few terms and phrases that will be used in this section:
Bit: Short for “binary digit,” a bit is the smallest unit of digital information. Bit-depth is the number of bits used to record the color or shade of each pixel in a scan. The greater the bit-depth, the greater the variety of shades that can be represented in a scanned image. Greater bit-depth also increases file size, however, and so we recommend adhering to the specifications below to generate master images that are high quality, yet not too large.
Grayscale: Provides a range of gray shades in an image, and delivers a finer scan quality than a black and white scan.
JPEG: Stands for “Joint Photographic Experts Group.” JPEG is an image file format that is most useful for creating images with file sizes small enough to display online or send electronically, while retaining decent image quality. In the DLA we refer to the JPEG as the “access image,” because it is this format that displays in the online collections. When a TIFF image is uploaded into CONTENTdm, a JPEG “access” image will automatically be generated from that larger “master” image.
OCR: Stands for “Optical Character Recognition.” When a document is scanned as a TIFF or JPEG, it creates an image file. The text can be read by the human eye, but cannot be searched by keyword. OCR software recognizes letters and generates a text document to accompany the image. These automated transcriptions are rarely perfect, but with some supervision and editing, the searchable transcript in combination with the accompanying image can create a multipurpose digital item.
Pixel: Stands for “picture element.” A pixel is the smallest unit of a digital image; each pixel records the color or shade at a single point of the scanned image.
PPI: Stands for “pixels per inch.” The more pixels that populate an image, the more detailed that image will be. More detail also translates into a larger file size, and at some point, the detail in an image can exceed what is necessary for any archival purposes or for what can be detected by the human eye. The specifications recommended below will generate master images at a sufficiently high quality.
Resolution: A grid pattern into which the original image is segmented. The number of pixels per inch determines how fine, or detailed, the grid will be. The higher the resolution, the more detailed the grid.
RGB Color: RGB stands for “Red, Green, Blue,” the basic colors that can be combined in various ways to produce a broad spectrum of colors. How many shades of each color, exactly, depends upon the bit-depth. The higher the bit-depth, the greater the range of shades that can be extracted in your scan. For most of our projects, the recommended setting for RGB color is 24-bit.
TIFF: Stands for “Tagged Image File Format”. TIFF is a high-resolution file format that is generally accepted as the standard for archival image files. TIFF files are generally very large, because they are scanned at a high degree of detail and image quality. Therefore, they cannot easily be downloaded from a website or other point of access, but they are valuable for creating high-quality “master” images from which smaller derivative image files, such as JPEGs, can be created for more convenient access.
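The relationships among bit-depth, ppi, and file size described in the terms above can be checked with a little arithmetic. The following is a minimal Python sketch, not part of any DLA or CONTENTdm tool. The function names and the 8 x 10 inch example are illustrative assumptions, and the size estimate covers uncompressed pixel data only (a real TIFF file adds header information and may be compressed).

```python
def shades(bit_depth):
    """Number of distinct values one pixel can take at a given bit-depth."""
    return 2 ** bit_depth


def pixel_dimensions(width_in, height_in, ppi):
    """Pixel width and height of a scan of an item measured in inches."""
    return round(width_in * ppi), round(height_in * ppi)


def uncompressed_bytes(width_in, height_in, ppi, bit_depth):
    """Approximate size of the raw pixel data, ignoring headers and compression."""
    width_px, height_px = pixel_dimensions(width_in, height_in, ppi)
    return width_px * height_px * bit_depth // 8


print(shades(8))    # 8-bit grayscale: 256 shades of gray
print(shades(24))   # 24-bit RGB: 16777216 colors (8 bits each for red, green, blue)
print(pixel_dimensions(8, 10, 400))                    # (3200, 4000)
print(uncompressed_bytes(8, 10, 400, 24) / 1_000_000)  # 38.4 (megabytes)
```

In this hypothetical example, scanning the same 8 x 10 inch item in 8-bit grayscale instead of 24-bit color would cut the raw data to one third, which is why the bit-depth setting matters as much as the ppi setting when planning storage.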
Specifications
The TIFF specifications below are intended for items measuring between 5.5” – 11.5” on the longest side. For larger items, decrease the resolution to 300 ppi. For items between 1.5” – 5.5”, increase the resolution to 800 ppi. For items measuring less than 1.5” on the longest side, see page 52 of “Best Practice Guidelines for Digital Collections at the University of Maryland Libraries” in “Additional readings.”
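The size rule above can be written as a small lookup. This is only a sketch in Python: the function name is hypothetical, `base_ppi` stands for whatever resolution the relevant format section specifies (this sketch does not define it), and the handling of items that measure exactly 5.5” or 11.5” is an assumption, since the ranges in the text share those endpoints.

```python
def recommended_ppi(longest_side_in, base_ppi):
    """Adjust a format's base resolution for the size of the original.

    longest_side_in: longest side of the item, in inches.
    base_ppi: resolution specified for the format (supplied by the caller).
    Returns None for items under 1.5 inches, which the handbook refers
    to the University of Maryland guidelines instead.
    """
    if longest_side_in <= 0:
        raise ValueError("size must be positive")
    if longest_side_in < 1.5:
        return None            # see the Maryland best practice guidelines
    if longest_side_in < 5.5:  # assumption: exactly 5.5 counts as mid-sized
        return 800
    if longest_side_in <= 11.5:
        return base_ppi
    return 300                 # larger items


print(recommended_ppi(4, 400))    # small photograph
print(recommended_ppi(8.5, 400))  # mid-sized item
print(recommended_ppi(17, 400))   # oversized item
```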
A summary chart is provided at the end of this section.
For more information, another very good and approachable primer on digital imaging is available at
Scanning text
Scan at 300 ppi, or with spatial dimensions at 4000 pixels across the longer dimension.
To scan an item using Optical Character Recognition software (see below), scan at 400 ppi. Scan in 8-bit grayscale, or in 24-bit color where it is important to the representation of the document. Save as TIFF image.
When scanning text documents, spatial resolutions should be based on the size of text found in the document, and resolutions should be adjusted accordingly. Documents with smaller printed text may require higher resolutions and bit depths than documents that use large typefaces.
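To compare the two options given for text (a fixed ppi, or 4000 pixels across the longer dimension), it can help to work out what scanner setting the pixel target implies for a given page. The Python sketch below is illustrative only; the function name and the page sizes are assumptions, and rounding up is a choice made here so that the long side reaches at least the target.

```python
import math


def ppi_for_long_side(width_in, height_in, target_px=4000):
    """Smallest whole-number ppi at which the longer side reaches target_px."""
    longest_in = max(width_in, height_in)
    return math.ceil(target_px / longest_in)


print(ppi_for_long_side(8.5, 11))  # letter-size page
print(ppi_for_long_side(5, 7))     # small pamphlet page
```

For a letter-size page the pixel target works out to a setting above 300 ppi, so the two options are not interchangeable; smaller pages need a proportionally higher setting to reach the same pixel count.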
Optical Character Recognition (OCR)
OCR software “reads” a scanned text document to generate a fully searchable transcript. See the ABBYY FineReader guide on your DLA flash drive for detailed instructions.