Progress Report on the OASIS Legal Citation Markup Technical Committee

John Joergensen, Rutgers University

Introduction:

In 2014, a group of legal technologists came together to form the Legal Citation Markup Technical Committee of OASIS. OASIS is an international standards organization that promotes the formulation of open technology standards. The purpose of the new committee (called LegalCiteM TC) is to formulate a general markup standard for legal citations, usable by governments, courts, and outside parties to ensure interoperability and openness in legal citation. The LegalCiteM TC is affiliated with LegalXML, which is addressing markup of legal documents more generally, concentrating on the refinement of the Akoma Ntoso legal markup standard.

The TC was convened by John Joergensen (Rutgers University, USA), and is currently chaired by Fabio Vitali (University of Bologna, Italy) and Melanie Knapp (George Mason University, USA). We are well into our work, and are in the process of refining many of the requirements that such a markup system must meet. This presentation is a report on what the TC has done so far, why it is taking the directions it is taking, and where we are headed.

Why Citation Markup:

On a general level, in order to have access to any law from any source, you need enough information to know where the law is located and how to find it. This is the information that citations convey. However, as anyone who has tried to create cross-links between documents originating from different sources knows, creating reliable and recognizable links between sources in a heterogeneous environment is very difficult. The URLs and URIs for material can rarely be recognized or predicted in such a way that links to the material can be easily derived.

Ex.

The example of a Google link to a U.S. court decision is typical of the coding one sees. The citation as it appears in print and the URL of the link to the document differ so much that the link cannot be derived from the citation.

The route to sustainable interoperability is a metadata standard for citations that can act as a baseline reference, usable by any court, legislature, or information vendor to describe links to other documents. Such a standard would facilitate resolver programs that can translate a standard citation into a particular link to an instance of the cited document.
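As a rough illustration of what such a resolver might do, the sketch below fills a provider-specific URL template from a set of citation fields. Everything in it is an assumption made for illustration: the field names (`court`, `year`, `number`), the provider name, the URL pattern on the reserved `.example` domain, and the placeholder values are hypothetical and are not taken from the LegalCiteM work.

```python
from typing import Dict, Optional

# Hypothetical URL patterns, one per provider. A real resolver would
# need a maintained registry of such patterns.
URL_TEMPLATES: Dict[str, str] = {
    "provider-a": "https://provider-a.example/{court}/{year}/{number}",
}


def resolve(fields: Dict[str, object], provider: str) -> Optional[str]:
    """Turn structured citation fields into a link for one provider.

    Returns None when the provider is unknown or when the citation
    lacks a field that the provider's template needs.
    """
    template = URL_TEMPLATES.get(provider)
    if template is None:
        return None
    try:
        return template.format(**fields)
    except KeyError:
        # The template asked for a field this citation does not carry.
        return None


# Placeholder values only; "EXCT" is not a real court code.
link = resolve({"court": "EXCT", "year": 2000, "number": 1}, "provider-a")
```

The point of the design is that the citation fields stay the same for every provider, and only the template changes.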

The goals and limitations of a standard for legal citation markup are inherent in the history and nature of legal citations. Legal systems differ between countries, but more significant are the variations in references to what is essentially very similar material. For example, a court decision may be published in a collection or reporter series, and be cited by its volume and page number within the publication. It may also exist as a pre-publication document as issued by the court. It may also exist online in yet another format. In each of these instances, a reference to the document may be radically different. A pre-publication court decision is typically referred to by some combination of its case identifier (docket number, etc.) and the date of issuance (as well as jurisdiction and court). A published decision is referred to by volume, work, and page number. An online edition is referred to by its own identifier, or by a universal citation such as those used by most of the LIIs. Each of those citations, however, refers to the same work.

Most, but not all, of these citations make specific reference to a manifestation of that work. A reader or information provider may occasionally be interested in that particular manifestation, but typically they are not. What they want is a reliable manifestation of any kind, either via a trusted vendor or other source.

Another important aspect of a citation metadata standard is the ease with which it can be derived from existing text citations. This is essential if the majority of legal material, both past and present, is to be processed without manual coding. An important part of the LegalCiteM standard is therefore to stick to information that is either present in, or can be reliably derived from, the content of existing print citations and the context of the document generally. For example, consider a citation of the form:

Opperman v CCMA and Others (C530/2014) [2016] ZALCCT 29 (17 August 2016)

Because this is a universal citation, the decision can be quickly and uniquely identified as the 29th decision issued in the year 2016 by a court identified as ZALCCT. The context of the document quickly lets us know that this is a South African court decision, as it was taken from SAFLII's collection of South African decisions. Further, some programmable knowledge of the abbreviation conventions of SAFLII also allows us to infer that this is not only South African, but from the Labour Court of Cape Town. The caption (Opperman v CCMA), docket number (C530/2014), and specific date of issuance are useful, but unnecessary in the current context.
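The derivation described for this citation can be sketched in a few lines of Python. This is a minimal illustration, not the TC's method: the regular expression, the `NeutralCitation` type, and the one-entry `COURT_CODES` table are assumptions, and the table stands in for the "programmable knowledge" of SAFLII's abbreviation conventions.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical lookup table standing in for knowledge of a publisher's
# court abbreviations; a real system would need a maintained registry.
COURT_CODES = {
    "ZALCCT": ("South Africa", "Labour Court, Cape Town"),
}


class NeutralCitation(NamedTuple):
    year: int
    court_code: str
    number: int


# Matches the "[2016] ZALCCT 29" portion of a citation string.
NEUTRAL_RE = re.compile(r"\[(\d{4})\]\s+([A-Z]+)\s+(\d+)")


def parse_neutral(text: str) -> Optional[NeutralCitation]:
    """Extract year, court code, and sequence number, if present."""
    match = NEUTRAL_RE.search(text)
    if match is None:
        return None
    return NeutralCitation(int(match.group(1)), match.group(2), int(match.group(3)))


cite = parse_neutral(
    "Opperman v CCMA and Others (C530/2014) [2016] ZALCCT 29 (17 August 2016)"
)
jurisdiction, court_name = COURT_CODES[cite.court_code]
```

The caption, docket number, and full date are deliberately ignored here, since the three extracted fields already identify the decision.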

A different example, however, gives us a different set of issues:

N.A.A.C.P. v. Township of Mount Laurel, 456 A.2d 390 (N.J. 1983)

In this case, the decision may or may not be uniquely identified by a volume and page number of a publication identified by “A.2d”. The context in which the citation appears would identify this as a United States decision. The “N.J.” parenthetical would further identify the court as the Supreme Court of New Jersey, and the year would specify the year of issuance.
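A comparable sketch for a reporter-style citation follows. It assumes a simplified shape (volume, a single-token reporter abbreviation, first page, then a parenthetical holding an optional court and a four-digit year); real citations vary far more than this pattern allows. The `ReporterCitation` type and the `COURTS` table are hypothetical names introduced for illustration.

```python
import re
from typing import NamedTuple, Optional


class ReporterCitation(NamedTuple):
    volume: int
    reporter: str
    page: int
    court: str  # empty string when the parenthetical holds only a year
    year: int


# Hypothetical table mapping a court parenthetical to a court name.
COURTS = {"N.J.": "Supreme Court of New Jersey"}

REPORTER_RE = re.compile(
    r"(\d+)\s+"                   # volume
    r"([A-Z][A-Za-z0-9.]*)\s+"    # reporter abbreviation, e.g. "A.2d"
    r"(\d+)\s+"                   # first page
    r"\(([^()\d]*?)\s*(\d{4})\)"  # optional court, then year
)


def parse_reporter(text: str) -> Optional[ReporterCitation]:
    """Extract volume, reporter, page, court, and year, if present."""
    match = REPORTER_RE.search(text)
    if match is None:
        return None
    volume, reporter, page, court, year = match.groups()
    return ReporterCitation(int(volume), reporter, int(page), court, int(year))


cite = parse_reporter(
    "N.A.A.C.P. v. Township of Mount Laurel, 456 A.2d 390 (N.J. 1983)"
)
```

Note that the parsed fields identify a location in a publication rather than a decision as such, which is where the ambiguity discussed next comes from.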

In FRBR terms, the N.J. citation is to an expression of the work, and the South African citation is to the work. However, both can be used to resolve a copy based on the information either present in or derivable from the citation, and our standard is being designed to accommodate this. More important, however, is the fact that the N.J. citation is also slightly ambiguous: more than one decision can start on a given page in a reporter publication, and a case with a given caption may have more than one decision issued in a given year. The odds of this are small, but not zero, and the ambiguity cannot be eliminated altogether. For this reason, a workable markup standard for citations needs to tolerate some level of ambiguity. It will then be for those writing resolvers that use the standard to decide how to deal with it.
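One way a resolver could tolerate this ambiguity is to return every candidate that matches a citation instead of assuming exactly one. The sketch below is an assumption about how that might look, not part of the standard: `CitationIndex` is a hypothetical in-memory index, and the reporter "Ex.Rep." and the document identifiers are made-up placeholders, not real records.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# A citation key: (reporter abbreviation, volume, first page).
Key = Tuple[str, int, int]


class CitationIndex:
    """Maps a reporter citation to every document that starts there."""

    def __init__(self) -> None:
        self._by_key: Dict[Key, List[str]] = defaultdict(list)

    def add(self, reporter: str, volume: int, page: int, doc_id: str) -> None:
        self._by_key[(reporter, volume, page)].append(doc_id)

    def resolve(self, reporter: str, volume: int, page: int) -> List[str]:
        # Zero, one, or several candidates; the caller decides what to
        # do with more than one (ask the user, compare captions, etc.).
        return list(self._by_key.get((reporter, volume, page), []))


index = CitationIndex()
index.add("A.2d", 456, 390, "doc-placeholder-1")
# Two made-up decisions that begin on the same page of a made-up reporter.
index.add("Ex.Rep.", 1, 100, "doc-placeholder-2")
index.add("Ex.Rep.", 1, 100, "doc-placeholder-3")

candidates = index.resolve("Ex.Rep.", 1, 100)
```

Returning a list keeps the policy decision (pick one, show all, or disambiguate by caption or date) with the resolver's author, as the text suggests.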

LegalCiteM is part of the larger LegalXML group of OASIS, and our work is intended to fit directly with those efforts. Generally, LegalXML is working on the expansion and refinement of the existing Akoma Ntoso legislative and legal information markup standard. However, as can be seen, the citation standard can be implemented independently of the other parts of LegalXML's work.

Another standard that needs to be considered is the European Legislation Identifier [ELI] standard. ELI is a project sponsored by the Council of the European Union. It is a standard for official online legislation publishers, and is recommended for use by publishers of legislation in all EU member countries. It facilitates information exchange between different legal systems.

ELI is essentially different in that it is a set of identifiers for legislation and parts of legislation, but it is not, in itself, a citation. The OASIS LegalCiteM TC standard is intended to apply to citations to documents. So, while a standardized identifier can, in fact, be used as a citation, this is not always the case. In addition, it should be noted that the LegalCiteM standard is intended to facilitate the conversion of old, legacy print citations into a form that machines can use to resolve the citation automatically into an existing instance of a document.

Current Progress:

A great deal of work has been completed on specifying the citation requirements of both legislation and court decisions. There are also sub-committees dedicated to administrative materials, regulations, and secondary sources, but those committees are still at work on their use cases. Work is now going forward in the technical sub-committee to finalize the needs of court decisions and legislation.

NOTES and References:

ELI:

Catherine Tambone's PPT presentation of 04 June 2014 for the UK National Archives:

ELI Homepage:

John Dann, notes on ELI:

Fabio Vitali, The CEN Metalex Naming Convention:

Fabio Vitali, The Akoma Ntoso Naming Convention:

LegalCiteM TC Courts Subcommittee Use Case Document: