Appendix: Visit Reports

A Comparative Analysis of Data Center Workload / Staff Effort Metrics

For the NSIDC DAAC

A Spin-Off of the Best Practices / Benchmark Study

September, 2001

G. Hunolt, SGT, Inc.

Funded by NASA Contract NAS5-00154

Outline

NSIDC DAAC Specific Cross-Site Analysis

1.0 Introduction

2.0 Context for the NSIDC DAAC Report

2.1 Goal

2.2 Approach

2.3 Data Center Reference Model

2.4 Reference Epoch

2.5 Selection of Sites for the Study

2.6 Sources of Information for the Study

2.7 Mapping Sites to the Data Center Reference Model

2.8 Normalization of Metrics

2.9 Selected Study Sites

3.0 Overall Normalized Workload vs Staff Effort Metrics

3.1 Characterization of the NSIDC DAAC and Comparison Sites

3.2 Sites’ Work Profile

4.0 Selected Functional Area Comparisons

4.1 Archive

4.2 Access and Distribution

4.3 User Support

4.4 Internal Support

5.0 Conclusions

Appendix I - Definition of Standard Metrics Set

Appendix II - NSIDC DAAC Site Summary

Appendix III - NSIDC DAAC Site Detail

NSIDC DAAC Specific Cross-Site Analysis

1.0 Introduction

This version of Section 5 of the original report, “ESDIS Data Center Best Practices and Benchmark Report” (September 28, 2001), prepared for PODAG (the NSIDC DAAC User Working Group), presents an NSIDC DAAC specific cross-site synthesis of workload vs staffing metrics, together with study conclusions, points of caution, and recommendations. It draws on the material contained in Sections 1 and 2 and the appendices of the original report, exactly as the original report’s Section 5 does.

Normalized metrics relating staff effort and workload will be defined, for an overall view and separately for each functional area of the reference model defined in Section 2 of the original report. A table of comparative values will be presented, and conclusions and recommendations will be drawn from the comparison. Areas where improvements in cost effectiveness might be pursued by SOO and the DAACs will be indicated and discussed.

While the reader may refer to the original report, for convenience Section 2.0 Context for the NSIDC DAAC Report reproduces the portions of the original report that introduce the workload vs staffing metrics analysis. Readers familiar with the original report can skip the Context section.

Section 3.0 presents and compares overall metrics, Section 4.0 extends the comparison to the individual functional areas, and Section 5.0 presents conclusions, cautions, and recommendations.

Appendix I presents definitions of the standard set of metrics used in making cross-site comparisons, Appendix II presents the NSIDC DAAC site summary contained in Section 4.14 of the original report, and Appendix III presents the detail for the NSIDC DAAC contained in Appendix D Section 14 of the original report, with some additional snippets included for PODAG that were not part of the published version.

2.0 Context for the NSIDC DAAC Report

2.1 Goal

The goal of this study is to assess the reasonableness of the staffing levels of the EOSDIS DAACs for the work that they do, and if possible to identify areas where there may be potential to improve productivity.

2.2 Approach

The approach taken to meet the goal stated above is a comparative analysis of workload vs staffing for a set of eleven working data centers and three of the eight EOSDIS DAACs. The analysis will show whether or not DAAC staffing for workload is comparable or consistent with staffing for workload at the other data centers. In cases where it is not, a closer look may reveal a genuine potential for improving DAAC cost effectiveness, or may show that the difference is explained by site or mission specific conditions that are facts of life.

In order to perform a comparative analysis, a set of workload vs staffing metrics that are normalized across the study sites and DAACs must be developed. These metrics are the first order input to the analysis; the information from which they are developed is a second order source of information when analysis of the normalized metrics identifies a line of further study.

The development of a set of normalized workload vs staffing metrics is not a straightforward task. While the overall work performed by the fourteen data centers (eleven study sites and three DAACs) is generally similar, each site has its own approach to organizing itself to perform the work, to describing the work and measuring performance, to describing positions and mapping work to positions, and so on. In addition, the study sites span a wide variety of sizes and scales of activity, and not all sites perform all functions (e.g. some sites may not archive data, while others may not generate products). The next sections describe how these problems have been addressed.

2.3 Data Center Reference Model

The first step was the outlining of a simple reference model of a “data center” in terms of a set of functional areas that taken together comprise the range of functions that a data center performs. Because the goal of the study calls for comparison of a number of data centers with NASA EOSDIS DAACs, and because the results of the study will only be useful to the SOO and the DAACs if they have meaning in their context, the reference model is by design consistent with the DAACs.

The following are working definitions of the functional areas that make up the data center reference model:

Ingest - the process of receiving, reading, quality checking, and cataloging incoming data, up to the point of insertion into the archive. Ingest can be manual or electronic, with manual steps involved in quality checking, etc. Incoming data can be from external sources or internally generated.

Processing - the generation and quality checking of new derived data products from data or products that have been ingested or previously generated, generally on a routine, operational basis. The DAACs (for the Terra mission and in other cases) and other sites may receive the software that embodies product generation algorithms from developers outside of the DAAC or site (e.g. Terra instrument teams for the DAACs, or GFDL for NCEP), who are responsible for the initial delivery and for delivering updated versions. Support provided by the data center for integration and test of this ‘science software’ is included as an activity under processing. In cases where a site develops algorithm software itself, that effort (i.e. development, integration, and test) is included under processing.

Archive - the insertion of data into archive storage, and the handling and preservation of data, metadata, and documentation within a site’s archive. Inserted data can include data ingested from sources external to the site, or data/products generated on-site. Handling and preservation include quality screening of data entering and exiting the archive, quality screening of archive media, backups, and accomplishing migrations from one type of media to another. Insertion into the archive can be electronic or manual (e.g. hanging tapes on a rack or popping them into a robotic silo).

Access and Distribution - generating and providing catalog information and a search and order capability to users, receiving user requests for data, fetching the requested data from the archive, performing any subsetting, reformatting, or packaging, and providing the end product to the user by electronic means or on physical media.

User Support - support provided in direct contact with users by user support staff, including responding to queries, taking orders, etc.

Internal Support - support provided by an organizational element or persons within a site to the organizational elements or persons responsible for the functions listed above. Internal support will include some or all of the following as applicable at a particular site:

a. hardware maintenance or coordination of hardware maintenance by vendors;

b. maintenance and enhancement of custom applications or system software (a.k.a. sustaining engineering);

c. COTS procurement, installation of COTS upgrades;

d. system administration, database administration;

e. resource planning and management;

f. network / communications engineering;

g. logistics, consumables, facilities, security management;

h. systems engineering, test engineering;

i. property inventory and management;

j. management of internal support functions.

In addition to the functional areas described above, staff effort at the sites is also allocated to two other areas, as appropriate for the site. These are:

Site Management - management at least one level up from direct management of the work performed in the functional areas. For example, in the case of a site with a ‘front office’ and several operating elements, the ‘front office’ staff would be allocated to Site Management while the management within the operating elements would be allocated across the functional areas in which each operating element is engaged.

Research and Development - research or development effort that is not engaged with the current work performed within the functional areas. Examples would be personal research performed (usually on a part time basis) by scientists on the staff of an operating element of a site, or development aimed at new missions or at work outside or beyond the scope of the work within the functional areas. (Development and maintenance of, for example, applications software currently in use would be allocated to Internal Support as noted above.)

Because of the extreme disparity across the sites in these two areas (mainly because of the different ways the ‘data center’ fits into its parent organization), they will not be considered in the cross-site analysis.

The reference model includes ‘standard’ or common metrics for the functional areas and for the data center as a whole. Section 2 describes a set of twenty nine site level reference model metrics for each of the functional areas given above - limited by the fact that the twenty nine reference model metrics do not all apply to every functional area. Section 3 describes a set of overall reference model metrics that are derived from the site level metrics defined in Section 2.
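
As an illustration only (the functional area names follow Section 2.3, while the Python names and structure below are notational conveniences and not part of the original report), the reference model can be pictured as a fixed set of functional areas together with a per-site record of the staff effort and workload measures allocated to each area:

    # Illustrative sketch only. The functional areas follow Section 2.3; the
    # class and field names are notational conveniences, not report terminology.
    from dataclasses import dataclass, field
    from enum import Enum
    from typing import Dict

    class FunctionalArea(Enum):
        INGEST = "Ingest"
        PROCESSING = "Processing"
        ARCHIVE = "Archive"
        ACCESS_AND_DISTRIBUTION = "Access and Distribution"
        USER_SUPPORT = "User Support"
        INTERNAL_SUPPORT = "Internal Support"
        # Allocated at the sites but excluded from the cross-site analysis:
        SITE_MANAGEMENT = "Site Management"
        RESEARCH_AND_DEVELOPMENT = "Research and Development"

    @dataclass
    class SiteProfile:
        name: str
        # Staff effort (FTE) allocated to each functional area the site performs.
        fte: Dict[FunctionalArea, float] = field(default_factory=dict)
        # Annualized workload measures per area, e.g. {"TB ingested per year": 12.0}.
        workload: Dict[FunctionalArea, Dict[str, float]] = field(default_factory=dict)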

2.4 Reference Epoch

An approximate reference epoch of late 2000 was selected for the study. Thus information collected for the study might be information as of the end of calendar year 2000, the end of fiscal year 2000, or where appropriate, for the full year CY 2000 or FY2000. Workload measures were annualized, i.e. annual rates (such as products generated per year or data volume ingested per year) were computed when partial year information was used. Workload rates were aligned in time with staffing data. In some cases staffing and/or workload were changing significantly during the 2000 timeframe. In such a case, given a certain level of staff effort as applicable for a given period of time, an annual workload performed by that staff would be computed from workload data for (as closely as possible) the same time period.
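
A minimal sketch of the annualization step is given below, assuming a workload measure observed over part of a year that is aligned with the period for which the staffing level applies (the function and argument names are illustrative, not from the report):

    # Illustrative only: scale a workload measure observed over a partial-year
    # period (the same period for which the staffing level is known) to an
    # annual rate.
    def annualized_rate(workload_in_period: float, period_months: float) -> float:
        if period_months <= 0:
            raise ValueError("period_months must be positive")
        return workload_in_period * (12.0 / period_months)

    # Example: 45 TB ingested over a 9-month period corresponds to 60 TB/year.
    tb_per_year = annualized_rate(45.0, 9.0)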

2.5 Selection of Sites for the Study

Sites were selected for the study because they provided a point of comparison with the DAACs in at least two or three of the data center reference model’s functional areas. In some cases a data center might include all of the functional areas; in other cases a site might include only a couple of areas, and might not consider itself, or be considered, a “data center” at all. For the purpose of the study, it was not necessary to restrict the selection of study sites to those which included all of the model’s functional areas; it was only necessary to ensure that across the sample of sites selected there would be multiple instances of each functional area.

The selected sites handle data comparable to that handled by the DAACs - science data collected from satellite and/or in-situ platforms, and products derived from those data, which means that these sites serve a somewhat similar user community and face similar problems in their data handling. All of the sites have an operational role as do the DAACs, some facing more stringent operational requirements than the DAACs.

The site selection resulted in a sample of sites embracing a fairly wide range of scale in terms of the magnitude of their workload and their size. The selected sites are listed below in section 2.9.

2.6 Sources of Information for the Study

This study would not have been possible without the generous cooperation of the eleven selected “survey” sites and the three DAACs. Some sites completed a survey asking background questions about the site, its mission, and activities (survey responses are included in each site’s appendix). Each survey site permitted a visit by members of the study team and provided briefings about its operation and a tour of its facilities. Site visit reports prepared by the study team members were a key input to this study (and are included in the appendix for each site). Site staff provided additional information, reviewed and commented on drafts of the study report sections discussing their site, and answered follow-up questions. In addition, extensive use was made of information gleaned from site websites and documents. Any errors of transcription or interpretation of site information that may appear in this study are entirely the responsibility of the study team.

The appendix for each site includes responses by the site to the site survey (as available), the site visit reports written by the study team members, other information provided by the sites (often email exchanges), and references to site documents and websites (URLs) from which information was obtained.

2.7 Mapping Sites to the Data Center Reference Model

One phase of the analysis consists of examination of the work and staffing of each site, and mapping these to the data center reference model. The results of the mapping are presented in this report at a summary level, and in detail in each site’s appendix, showing the mapping of the sites’ activities to the functional areas and the workload and staffing associated with each functional area.

Because the actual staffing of a study site would only correspond by coincidence to the model, the staffing levels associated with the functional areas are best thought of as measures of effort rather than designations of specific positions or persons to functions. In many sites, individual staff members are associated with more than one functional area of the reference model. For example, especially at a smaller site, a group of operators may be collectively responsible for monitoring ongoing ingest, processing, archive insertion, and distribution processes, with any members of the group dealing with exceptions in any phase of the operation as they arise. The operators are not each assigned to a functional area; rather, they share a joint responsibility for the operation as a whole. For the study, the total effort of the group is allocated to functional areas based on an estimate of the relative effort expended on each, even though the areas are not treated as distinct by the site or the group of operators involved.
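
As a sketch of this allocation (the numbers below are invented for illustration; the actual estimates for each site are documented in the site appendices), the pooled effort of such an operator group can be split across functional areas by estimated fractions of effort:

    # Illustrative only: split a pooled group's total FTE across functional
    # areas using estimated fractions of effort, which must sum to 1.0.
    def allocate_effort(total_fte: float, fractions: dict) -> dict:
        if abs(sum(fractions.values()) - 1.0) > 1e-6:
            raise ValueError("effort fractions must sum to 1.0")
        return {area: total_fte * frac for area, frac in fractions.items()}

    # Example: a six-operator group estimated to divide its effort as follows.
    shared_ops = allocate_effort(6.0, {
        "Ingest": 0.30,
        "Processing": 0.25,
        "Archive": 0.20,
        "Access and Distribution": 0.25,
    })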

From the point of view of the site, the mapping to the model’s functional areas might seem awkward and artificial, and often assumptions must be made to provide some basis for the mapping. These assumptions are noted.

Not all of the sites map fully to the reference model - not all sites will have work that maps to all of the model’s functional areas.

2.8 Normalization of Metrics

Once the mapping is accomplished, individual site metrics for the functional areas can be computed. A set of ‘standard’ metrics was defined, to be developed from each site’s information as they apply. But for these to be compared across sites, some degree of normalization must be used to compensate for the wide range of differences between the sites in volume of work and size of staff.

The first normalization factor used in this study is a simple computation of annualized work measures per staff effort (as in terabytes ingested per year divided by the FTE level of effort associated with ingest). Refinement of the normalization is an area for further thought. The annualized rates of work per effort can be thought of as rough measures of productivity.
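
A minimal sketch of this first order normalization is shown below, assuming an annualized workload measure and the FTE level of effort for the same functional area are already in hand (the names are illustrative, not from the report):

    # Illustrative only: normalize an annualized workload measure by the staff
    # effort (FTE) associated with the same functional area, e.g. terabytes
    # ingested per year per FTE of ingest effort.
    def workload_per_fte(annual_workload: float, fte: float) -> float:
        if fte <= 0:
            raise ValueError("FTE must be positive")
        return annual_workload / fte

    # Example: 60 TB/year ingested with 2.5 FTE of ingest effort is 24 TB/year per FTE.
    ingest_productivity = workload_per_fte(60.0, 2.5)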

The result is a standard set of twenty nine metrics that are compiled for each individual site (as they apply) and which then become the basis for the cross-site comparative analysis described in Section 3. The standard metrics set is defined in Appendix I.

2.9 Selected Study Sites

The sites that were considered in this analysis are:

1. NCDC - NOAA/NESDIS National Climatic Data Center,

2. NCEP - NOAA/NWS National Centers for Environmental Prediction, Central Operations (NCO),

3. STScI - Space Telescope Science Institute,

4. MODAPS - MODIS Adaptive Processing System, NASA/GSFC,

5. MARF/MPEF - the EUMETSAT Meteorology Archive and Retrieval Facility (MARF) and Meteorological Product Extraction Facility (MPEF),

6. CERSAT - Centre ERS d’Archivage et de Traitement,

7. BADC - British Atmospheric Data Center,

8. IPD - NOAA/NESDIS Office of Satellite Data Processing and Distribution