The Example of DP4lib

A Cost Model for a Long-Term Preservation Service

Table of Contents

Introduction

The Project: Digital Preservation for libraries

The general cost model

Cost model overview

Cost Types

Cost Elements

Indirect and direct costs

The Example of DP4lib

Implementation of the Cost-By-Service Cost Model

Cost Types and Cost Elements of the LTP-Service

Distribution Keys

Evaluation

Appendix

1. Cost-By-Service Cost Model of the Project DP4lib

2. Cost-By-Service Cost model: Ingest-Service

3. Distribution Keys for the cost model of the project DP4lib

Bibliography

Introduction

Determining the actual costs of a long-term preservation (LTP) system is a difficult task. This is due, among other things, to the fact that most of the LTP-systems for which cost models must be built are still largely in the construction or testing phase. Consequently, there is little practical experience in running these systems and in evaluating their corresponding cost models. Likewise, it is very difficult to compare cost models and to judge whether they are equivalent or different. Each cost model must ultimately be tailored to a particular system if it is to be used in budgeting, accounting and charging for an LTP-system. This situation also applies to the project DP4lib, in which services for digital preservation are to be developed and offered to third parties.

This article is divided into two major parts. The first part presents a general cost model for IT services. The second part adapts the general model to the individual DP4lib LTP-system. First of all, however, a brief overview of the project DP4lib is given.

The Project: Digital Preservation for libraries

The project Digital Preservation for libraries was funded by the Deutsche Forschungsgemeinschaft (DFG). The goal of this project was to evaluate the feasibility of all options for establishing and running a ready-to-operate service for long-term preservation (LTP). As an additional project task, the preceding conceptual work was implemented in an initial LTP business process. The starting point of all the work in the project was the development of a requirement catalogue. All requirements that are considered mandatory from the customer's point of view for service-based LTP have been included in this catalogue.

Following the requirement specification, a service portfolio was developed, which includes technical services (e.g. hardware and software interfaces), specialized LTP services (e.g. technical metadata generation or SIP processing) as well as management services such as reporting and incident-management services.

The resources and funds required to set up and operate this service-based LTP-system are discussed in this article.

The general cost model

IT Services are usually viewed as critical to the organization. The increased demand for new technologies and the complexities of new decentralized systems have frequently caused IT services costs to grow faster than other costs. As a result, organizations are often unable or unwilling to justify expenditure to improve services or to develop new ones.

Due to the complex nature of Accounting for IT usage, it is rare that the actual running costs of the IT services are properly identified and this often leads to dissatisfaction with the perceived added value from the service. To address this dissatisfaction, it is quite common to implement IT Accounting and Budgeting processes and often to implement Charging processes as well.

As a basis for each process, a cost model is created, and the level of detail of the cost model depends on the process to be supported. Starting from Budgeting, the required level of detail increases greatly, and so does the cost of maintaining the cost model. This leads to a dilemma: the more complex the cost model is, the more costly and cumbersome it is to set up and operate.

Cost model overview

In order to calculate the costs of a service provision and operation, it is necessary to design a framework in which all known costs can be recorded and allocated to a specific activity, customer or other category. In the following, this is called a cost model.

In general, various types of cost models can be distinguished:

  • Cost-By-Customer,
  • Cost-By-Service and
  • Cost-By-Location

In the first case, the aim of the cost model is to assign the costs to a specific customer. In the second case, the costs required for the operation of a particular service are recorded. In the last case, the cost model is used to determine the cost of a branch office. This article concentrates on a Cost-By-Service cost model that enables the calculation of all costs for a specific service: the LTP-Service.

Cost Types

When estimating the costs for Budgeting, Accounting or Charging items, it is useful to categorise costs into cost types to ensure that they are correctly identified and managed. This categorisation should be consistent and comprehensible. Typical resulting cost types are:

  • hardware costs
  • software costs
  • people costs
  • accommodation costs
  • External Service costs and
  • transfer costs

At first glance, the last two items may not seem to meet the categorisation criteria for cost types mentioned above, so they need some further explanation. Today it is common to buy in services from external providers (external services). In most cases, the cost elements of this cost type represent a mixture of cost types; the provision of a datacenter is one example. It may be difficult to break down such a cost into the self-imposed cost categories, because some elements may be indivisible or the supplier may not wish to go into detail. In these cases it is much easier to categorise the cost as an External Service cost.

Transfer costs are those that represent goods and services that are sold from one part of an organization to another one. Transfer costs should be visible in the cost model because people may forget that internal goods and services are part of the cost of providing services.

Cost Elements

For budgeting purposes it may be sufficient to subsume all costs under the cost types only. If more detail is required in calculating costs, the chosen major cost types can be further divided into cost elements. For instance, hardware might be divided into network and servers. The purpose of this is to ensure that every cost identified in the organization can be placed within a table of costs, by type.

Typical cost elements within a major cost type are shown in the following Table:

Table 1: Examples of cost elements

Cost Types / Cost Elements
Hardware / Central processing units, LANs, disk storage, peripherals, etc.
Software / Operating systems, scheduling tools, databases
People / Payroll costs, benefit cars, etc.
Accommodation / Offices, utilities
External Service / Disaster Recovery Services, outsourcing services
Transfer / Internal charges from other cost centers within the organisation
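
The two levels of detail described above (cost types for budgeting, cost elements for more detailed calculations) can be illustrated with a minimal Python sketch. The cost types and elements are taken from Table 1; all amounts, the name `cost_items` and the function `totals_by_type` are hypothetical and serve only as an illustration, not as DP4lib figures.

```python
from collections import defaultdict

# Each entry: (cost type, cost element, yearly amount in EUR).
# The amounts are invented for illustration only.
cost_items = [
    ("Hardware", "Disk storage", 12000.0),
    ("Hardware", "LANs", 3000.0),
    ("Software", "Databases", 5000.0),
    ("People", "Payroll costs", 60000.0),
    ("Accommodation", "Offices", 8000.0),
]


def totals_by_type(items):
    """Collapse cost elements into their cost types (the budgeting view)."""
    totals = defaultdict(float)
    for cost_type, _element, amount in items:
        totals[cost_type] += amount
    return dict(totals)


budget_view = totals_by_type(cost_items)
print(budget_view["Hardware"])  # 15000.0
```

Keeping the element level in the data and deriving the type level by summation means the coarser budgeting view can always be produced from the detailed table, whereas the reverse is not possible.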

In determining the necessary level of detail of a cost model, the organization is faced with a cost-benefit analysis. When performing a detailed cost analysis, it is always essential to ensure that the value provided by the answers is not outweighed by the cost of data collection and analysis. For most business-driven cost models, apportionment should be firstly simple, secondly fair and thirdly accurate (as far as possible).

Indirect and direct costs

The cost-by-service cost model requires that all major cost elements are identified and then attributed to the service that ‘caused’ them. To do this, the costs first have to be identified as direct or indirect:

  • Direct costs are clearly attributable to a single service
  • Indirect costs are those incurred on behalf of all or a number of services, e.g. Ingest and Access, and have to be apportioned to these services in a fair manner.

Furthermore, any indirect costs that cannot be apportioned to a set of services then have to be recovered from all services in as fair a way as possible.
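
The attribution of direct costs and the apportionment of indirect costs can be sketched as follows. This is a minimal illustration under stated assumptions: the functions `apportion` and `cost_by_service`, the proportional weighting scheme and all amounts are hypothetical and are not taken from the DP4lib spreadsheet.

```python
def apportion(amount, keys):
    """Split one indirect cost across services in proportion to the given keys."""
    total = sum(keys.values())
    if total <= 0:
        raise ValueError("distribution keys must sum to a positive value")
    return {service: amount * weight / total for service, weight in keys.items()}


def cost_by_service(direct, indirect):
    """Add each service's share of every indirect cost to its direct costs."""
    result = dict(direct)
    for amount, keys in indirect:
        for service, share in apportion(amount, keys).items():
            result[service] = result.get(service, 0.0) + share
    return result


# Hypothetical yearly figures in EUR.
direct = {"Ingest": 10000.0, "Curation": 4000.0, "Access": 6000.0}
indirect = [
    # Incurred on behalf of Ingest and Access only, weighted 2:1.
    (9000.0, {"Ingest": 2, "Access": 1}),
    # Not attributable to a subset of services, so spread evenly over all.
    (3000.0, {"Ingest": 1, "Curation": 1, "Access": 1}),
]

totals = cost_by_service(direct, indirect)
print(totals)
```

Because every indirect amount is fully distributed, the per-service totals add up to the sum of all direct and indirect costs, which is a simple consistency check for any such model.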

Figure 1 shows an overview of the introduced cost model.

Figure 1: Cost model for a Cost-by-Service approach (from

The Example of DP4lib

Within the project “Digital Preservation for libraries” a service portfolio was generated, which consists of more than a dozen large and small services for long-term preservation and controlling activities.

As specified above, a separate cost model has to be built for each service. By adding up the costs of each service, the total costs of the service portfolio are obtained. For a first approximation, however, this approach is much too detailed and inefficient. A cost-benefit analysis should always determine the necessary level of detail.

To create a cost model for the operation of a long-term preservation service, the LTP service portfolio of DP4lib was divided into three main services: Ingest, Curation and Access. The OAIS[1] reference model was used as the basis for this division. The result is shown in Figure 2.

Figure 2: Distribution of all services into three main services based on the OAIS reference model
(CCSDS 2002, page 4-1)

In the following, the main services are listed and their respective sub-services are briefly explained:

  1. Ingest
  • Reception of the objects:

The reception of objects includes both the unpacking of the material and an integrity check. In addition, a role check (a peculiarity of DP4lib) is performed, which verifies whether the material is relevant for the DNB mandatory delivery.

  • Metadata handling:

A classification is made according to the agreed Ingest-Level. In addition, technical metadata are generated and style properties of digital objects are validated. The supplied descriptive metadata are processed and made available in the databases of the long-term archival system.

  • SIP handling:

The supplied transfer packages are processed to UOF SIPs. Objects supplied as UOF SIPs are already prepared for import into the LTP system.

  • Reporting and Protocol Management:

The entire ingest process is documented in detail. All messages to the employees, such as error logs, ingest protocols and ingest reports are created, transmitted and stored.

  • Storage of Objects:

The prepared UOF SIPs are sent to and stored in the storage system. The process can be terminated only after the employees have accepted the ingest report.

  2. Curation
  • Digital Lifecycle Management:

The digital objects must be maintained and observed during their period of use. This includes all contractual activities as well as a comprehensive technology watch and maintenance.

  • Conservation activities:

Currently, only migration scenarios are intended. At present, however, the costs of migrating documents are not included in the cost model; even an estimate of these costs is considered inexpedient. Such an estimate is first prepared at the time of the order.

  • Integrity check and conservation:

The checksums of the objects are checked regularly to ensure the integrity of the digital objects in the long term. If necessary, corrupt objects are replaced by valid objects that are restored from the redundant backups. Reports and protocols about these activities are created regularly and sent to the employees.

  • Retrieval (Search and Access):

To access the objects, an intrinsic retrieval is also provided.

  3. Access
  • Authentication:

When the system is accessed, the authentication mechanisms ensure that the respective employees get access only to those objects they are entitled to use.

  • Search:

A search interface is provided to the employees for the data to which they have authorized access.

  • Retrieval:

The digital objects are offered through a retrieval interface.

For each of these three main services a Cost-by-Service cost model was established. The result of the considerations made on this basis is an Excel spreadsheet (cf. Appendix 1, 2), which is designed to support service providers in calculating the costs of this special DP4lib-LTP service.

Implementation of the Cost-By-Service Cost Model

The following shows the implementation of the Cost-by-Service cost model. Here, both construction costs as well as operating costs for the LTP-Service were taken into account.

Cost Types and Cost Elements of the LTP-Service

The first step is to identify all cost elements and divide them into consistent cost categories. The cost elements in the field of LTP services can be divided into the cost types hardware, software, employment, accommodation and external services. Transfer costs are in this case included in other existing cost elements. Regarding the classification of cost elements into cost categories, it should be pointed out that the following cost elements are based on the characteristics of the project DP4lib; the list makes no claim to completeness as far as the possible cost elements of an LTP-system are concerned.

In the following, all identified cost elements are listed. In addition, some economic aspects are discussed with regard to depreciation and possible distribution keys.

Hardware

The following cost elements were identified:

  • Server:

Servers are required for receiving transfer packages from the service users and for the return delivery of archived digital objects to the service users. Sufficiently dimensioned servers for databases are also covered by this point.

  • Disk Cache:

Sufficient disk space on the servers is required for receiving and temporarily storing large data sets.

  • Storage (200 LTO5 tapes of 1 TB each):

100 TB were purchased for the project DP4lib. For this purpose, 200 LTO5 tapes, including backup, were procured.

  • 200 slots within a storage cabinet:

For every LTO5 tape a slot within a storage cabinet is rented.

  • 2 Tape Drives:

Two tape drives are required for the operation of the LTP-service.

  • Re-configuration of the DIAS system:

This includes the selection and renewal of all necessary system components, their installation and configuration as well as their connection to the local infrastructure.

In calculating the costs for this cost category, the depreciation of these assets needs to be considered. Depreciation is applied to capture the consumption of non-current assets. To calculate depreciation, the depreciation value, the depreciation period and a method suitable for determining the rate at which assets are used up are required.

The acquisition costs are often used as the depreciation value. For the purpose of maintaining the assets, however, it is also possible to deviate from them. This should happen especially in cases where it is predictable that the prices of assets will increase or decrease over time. In these cases, the depreciation value is the amount required to obtain a new asset at the end of the old asset's useful life.

The depreciation period describes the operational useful life of an asset. Empirical values from the manufacturers or depreciation tables can be used to determine the useful life span. The depreciation method should be chosen such that the depreciation corresponds to the actual consumption as far as possible. If the calculation is applied for costing purposes, linear depreciation methods are advantageous. The formula to calculate the depreciation value is as follows:

$$a_t = \frac{A - R_n}{n}$$

With:

a_t = Depreciation in period t

A = Acquisition cost

R_n = Residual value at the end of the useful life

n = Useful life (number of periods)
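
Linear depreciation spreads the difference between the acquisition cost and the residual value evenly over the useful life. The following minimal Python sketch works through one example. The asset and its amounts are hypothetical, and the function names `linear_depreciation` and `book_values` are chosen for illustration only.

```python
def linear_depreciation(acquisition_cost, residual_value, useful_life):
    """Depreciation per period under the linear (straight-line) method."""
    if useful_life <= 0:
        raise ValueError("useful life must be a positive number of periods")
    return (acquisition_cost - residual_value) / useful_life


def book_values(acquisition_cost, residual_value, useful_life):
    """Remaining book value at the end of each period, starting with period 0."""
    per_period = linear_depreciation(acquisition_cost, residual_value, useful_life)
    return [acquisition_cost - per_period * t for t in range(useful_life + 1)]


# Hypothetical asset: bought for 10,000 EUR, worth 1,000 EUR after 5 years.
yearly = linear_depreciation(10000.0, 1000.0, 5)
print(yearly)  # 1800.0
print(book_values(10000.0, 1000.0, 5))
```

If the replacement cost rather than the acquisition cost is to be used as the depreciation value, the same function can be called with the expected replacement price as its first argument.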

Software

Here, costs arise mainly from license fees. The following fees must be taken into account:

  • Database license:

This includes all database licensing costs needed for upstream and downstream processes of the LTP services. At the moment there are no costs on this point, but this will change in the near future.

  • Online Analytical Processing (OLAP)-System:

When information must be given to the management and to the administration, the assistance of an online analytical processing (OLAP) system may be useful. For example, the number of accesses to the archive, trends, or the number of file formats could provide meaningful information for the management to refine the services. License fees and, where applicable, support costs must be included here.

  • IBM licenses (DIAS-Software):

Cost of operating a DIAS mandator for a period of 2 years.

  • IBM-License (Standard software):

Under this item, costs incurred for the basic operation of the DIAS system are summarized.

  • Email-Ticket-System (JIRA):

Expenses are incurred for an internal and external ticketing system.

  • Monitoring (e.g. Nagios, logging, etc.):

This includes selection, installation and maintenance of monitoring software for the LTP service. For all cost categories, monitoring systems are possible.

Employment

The following staff is required for the operation of the LTP service[2]:

  • Technical Support:

The skills needed for technical support include: Java, Linux, web applications, databases, client-server applications, First-Level-Support.

  • Manager:

The manager of an LTP service provider requires, among others, the following skills: customer service, customer acquisition, requirements management, Service Level Agreements, contract negotiations.

  • Assistant:
    Responsible for daily operations as well as for the first escalation level.
  • Developer:

For further development and maintenance of the LTP system, a developer requires knowledge of: Java, Web-Services, client-server applications, scripting, Second- and Third-Level-Support, Unix, Windows, databases (MySQL, Oracle).

  • Officer:

The profile of an officer includes: experience in long-term archiving, well-founded knowledge of library science, First- and Second-Level-Support.

Accommodations

Cost components in this area are:

  • Office for 5 employees:

For the employees listed in section "Employment", offices as well as office furniture, material, etc. must be made available. In addition, the respective workstations and the licensing costs for their operating systems must be subsumed here.

External Services

External services are also needed for the LTP services. These include: