ASIS&T Annual Meeting:

Finding Victoria Gray Adams: Name Authority Control in Digital Collections

Suzanne R. Graham

University of Georgia, Main Library, Cataloging Dept. Athens, GA 30602-1641

Sheila McAlister

University of Georgia, Digital Library of Georgia, Athens, GA 30602-1641

Introduction

Name authority control, the disambiguation and linking of variant forms of proper nouns (most commonly personal, geographical, and corporate names), is a key tool for search retrieval in any database. As large consortia and other cooperative pools of digital collections dominate the online resources environment, every digital library faces the challenge of providing accurate and interoperable name headings for its items, or it risks being underutilized or ignored by researchers. Current approaches to identifying and linking name access points fall along a spectrum, from separate metadata records to encoded markup within the digitized text. The solutions adopted reflect the priorities, resources, and imagination of each program’s staff.

Methodology

Researchers analyzed the authority control practices of a representative sample of digitization programs in the United States, Canada, and Europe to assess the current state of name heading reconciliation and to highlight successful and innovative approaches to this issue.

Results

Many large consortia expect their participants to reconcile variant name forms before they contribute metadata records to an aggregator (be it a web search engine, a union catalog, or an OAI-based database). Metadata staff search for and accept headings found in the Library of Congress-NACO Name Authority File (LC/NACO NAF), discipline-based name files (e.g., the Union List of Artists’ Names), and bibliographic utilities (e.g., RLIN and OCLC), or they prescribe a specific syntax for local name construction based on the ISO standard or another commonly used representation (e.g., “last, first, dates”).
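A prescribed local syntax of the “last, first, dates” kind can be sketched in a few lines of Python. This is a minimal illustration only: the function name build_heading, its parameters, and the open-ended date convention are assumptions made for the example, not rules taken from any consortium's guidelines or from the LC/NACO NAF.

```python
from typing import Optional


def build_heading(last: str, first: str,
                  birth: Optional[int] = None,
                  death: Optional[int] = None) -> str:
    """Build a local name heading in "last, first, dates" form.

    Hypothetical helper: real cataloging rules cover many more cases
    (titles, fuller forms of name, uncertain dates, and so on).
    """
    heading = f"{last.strip()}, {first.strip()}"
    if birth is not None or death is not None:
        # Assumed convention: a missing year is left blank, e.g. "1926-".
        birth_part = str(birth) if birth is not None else ""
        death_part = str(death) if death is not None else ""
        heading = f"{heading}, {birth_part}-{death_part}"
    return heading


# Illustrative values only.
print(build_heading("Adams", "Victoria Gray", 1926, 2006))
print(build_heading("Adams", "Victoria Gray"))
```

Even a rule this small shows where the cost arises: someone must still establish which part of the name is the entry element and verify the dates before the heading can be built.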

This practice engenders potentially high costs or high risks. It adds expense to the creation of metadata records by requiring expertise in the creation of name headings. Research and documentation of names may be particularly time-consuming activities that must be handled by specially trained staff in order to meet the standards for adding headings to the LC/NACO NAF. Alternatively, if the names are not established in a national file, there is a risk that the headings will be useless if the form selected locally varies from a “standard” form adopted by a larger community.

Instead of accepting names according to one master list for a surrogate record, other digital programs use encoded solutions to address authority control. These automated or semi-automated techniques for name form reconciliation offer promise to institutions and repositories that have the ability and resources to use computer algorithms to link name headings. However elaborate, these matching programs are not perfect and usually require some level of human quality control.
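The following Python sketch illustrates the general idea of algorithmic name matching with a human review step. It is not the method of any program surveyed: the normalization rule, the use of the standard library's difflib.SequenceMatcher, the function names, and the two thresholds are all assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, drop punctuation, and sort the tokens so that
    "Adams, Victoria Gray" and "Victoria Gray Adams" compare equal."""
    tokens = re.findall(r"[a-z]+", name.lower())
    return " ".join(sorted(tokens))


def similarity(a: str, b: str) -> float:
    """Return a similarity score between 0.0 and 1.0 for two name strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def classify(a: str, b: str, auto: float = 0.9, review: float = 0.6) -> str:
    """Decide what to do with a candidate pair of headings.

    The thresholds are arbitrary placeholders; a real program would
    tune them against its own data.
    """
    score = similarity(a, b)
    if score >= auto:
        return "link"      # confident enough to link automatically
    if score >= review:
        return "review"    # queue the pair for a human cataloger
    return "distinct"      # treat as different entities


print(classify("Adams, Victoria Gray", "Victoria Gray Adams"))
print(classify("Adams, Victoria Gray", "Gray, Victoria"))
```

The middle band between the two thresholds is where the human quality control described above would take place: pairs such as a maiden name and a married name are similar enough to flag but not safe to link without checking.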

A third approach, conceptually between the first two, requires digital program personnel to embed key codes for particular names in the marked-up text. These key codes link varying headings for the same entity without identifying a “correct” form of the name. While this method is the most culturally sensitive and flexible, it demands the same expenditures of time and expertise as the two previous practices.
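As a rough illustration of key-code linking, the Python sketch below groups every name form in a marked-up text by its embedded key, without designating any form as preferred. The persName element with a key attribute follows a TEI-style convention; the sample fragment, the key values, and the function name variants_by_key are hypothetical and invented for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical marked-up fragment; the keys p0001 and p0002 are invented.
SAMPLE = """<text>
  <p>Interview with <persName key="p0001">Victoria Gray</persName>.</p>
  <p>Indexed as <persName key="p0001">Adams, Victoria Gray</persName>.</p>
  <p>Also cited as <persName key="p0001">Victoria Gray Adams</persName>
     and, separately, <persName key="p0002">Jane Example</persName>.</p>
</text>"""


def variants_by_key(xml_text: str) -> dict:
    """Map each key code to the sorted list of name forms tagged with it."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(set)
    for element in root.iter("persName"):
        key = element.get("key")
        if key:
            # Collect the text of the element, including any nested markup.
            groups[key].add("".join(element.itertext()).strip())
    return {key: sorted(forms) for key, forms in groups.items()}


for key, forms in variants_by_key(SAMPLE).items():
    print(key, forms)
```

A search on any one of the grouped forms can then retrieve passages tagged with the others, but only because someone assigned the same key to each occurrence during markup.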

Conclusions

This exploration of the approaches that digitization projects in educational repositories use to identify and collocate name variants provides an overview of current practices and emerging technological solutions, in hopes of promoting dialog between proponents of programming-based solutions and adherents of more traditional library name files.

The methods vary in their conceptual approach to the problem. Creating master lists of preferred names (be it a small local list offered in tandem with the searchable database or a massive national library effort like the LC/NACO NAF) and encoding name keys both assume that the work is done at the time of metadata entry or text markup. Reliance on matching algorithms shifts the linking work to the time of the search. None of the methods for name reconciliation addresses every issue in an authority control discussion, nor does any offer a quick fix. All require real investments of time and resources to function consistently and satisfactorily for users.