The Accounting (ACCO) system requirements
Dec 6th, 2005

Table of Contents

The Accounting (ACCO) system requirements

1 Introduction

1.1 Purpose

1.2 Scope

1.3 Definitions, acronyms and abbreviations

1.4 References

1.5 Overview

2 General Description

2.1 Users

2.2 System perspective

2.2.1 System Interfaces

2.2.2 User Interfaces

2.3 Main Functionalities

2.4 Constraints

2.5 Assumptions and Dependencies

3 Use Cases

3.1 Actors

3.2 Site Resources Manager Use Cases

3.2.1 Use Case 1 - CE usage at specific site per VO

3.2.2 Use Case 2 - Storage Space Usage at Specific Site per VO

3.2.3 Use Case 3 - Storage I/O usage at specific site per VO

3.3 VO manager use cases

3.4 VO member use cases

3.5 Accounting system administrator use cases

4 Specific requirements

4.1 External interface requirements

4.2 Functional requirements

4.2.1 Accounting data gathering

4.2.2 Processing accounting data to produce accounting records

4.2.3 Storing and managing accounting records

4.2.4 Displaying accounting information

4.3 Performance requirements

4.4 Design constraints

4.5 System Attributes

4.5.1 Availability

4.5.2 Security

4.5.3 Maintainability

4.5.4 Portability

1  Introduction

1.1  Purpose

The purpose of this document is to identify a set of requirements for an accounting system for the first deployment of the OSG infrastructure. More specifically, for the first version of the OSG the aim is to define a minimal data model for the accounting information, ensure the necessary collectors and sensors are in place at the resource providers’ sites, and define and deploy repository and access tools for reporting and analyzing grid-wide accounting information.

This document is mainly directed to the resource provider managers and to the OSG Blueprint team. The goal is to develop a common understanding of what the resource providers’ needs and expectations are for the first OSG version and to understand how to integrate the accounting system with the OSG architecture.

It is assumed that the reader is familiar with the mission and organization of the OSG consortium. For further information see www.opensciencegrid.org.

1.2  Scope

In the long-term vision, the OSG consortium proposes to bring together a very heterogeneous group of resource providers and resource consumers, each with a different set of requirements and interests. To succeed in such an endeavor, the OSG infrastructure must provide its users with precise and reliable information about resource utilization. Availability of such information will

·  allow resource providers to directly link resource utilization with experiment and project goals,

·  improve resource planning and organization at the resource providers’ sites,

·  strengthen the security of the Grid infrastructure as well as of the participating sites,

·  and eventually support automatic resource allocation and consumption based on an economic model.

From this perspective, the OSG Accounting system (ACCO) proposes to collect resource utilization data from Grid components and services and to process them into accounting records that resource providers and consumers can easily access and analyze through a simple interface. At this early stage of the OSG development, the ACCO system will not be concerned with supporting an economic or pricing model for automatic resource allocation and consumption, and it will be built by reusing the deployed Grid3 monitoring infrastructure as much as possible.

1.3  Definitions, acronyms and abbreviations

Add here the definition found in [4], paragraph 2.

ACCO / The accounting system for the OSG
Accounting data / Any data that can be processed to create an accounting record. For example, measurements taken by the monitoring system are accounting data. Records from other accounting systems can also be considered accounting data.
Accounting record / A record produced by the accounting system from accounting data. An accounting record is the fundamental unit of information in the accounting database.

1.4  References

[1]  The Life Cycle Data Harmonization Working Group of the Software Engineering Standards Committee of the IEEE Computer Society, IEEE Recommended Practice for Software Requirements Specifications, IEEE Std 830-1998, www.ieee.org.

[2]  Peter Garfjall, Accounting in Grid Environments, an architectural proposal and a prototype implementation, Master Thesis, 27 May 2004, Umea University, Sweden.

[3]  The Blueprint team, A Blueprint for the Open Science Grid, www.opensciencegrid.org

[4]  M. Mambelli et al., Grid2003 Monitoring, Metrics, and Grid Cataloging System, CHEP2004.

1.5  Overview

Section 2 of this document gives an overview of the ACCO system and relates it to other parts of the OSG infrastructure. Section 3 presents the use cases and section 4 the detailed functional and non-functional requirements.

Figure 1: The ACCO and its relations to other OSG logical components.

2  General Description

Figure 1 is a logical diagram showing how the ACCO relates to other systems in a Grid environment. The boxes indicate logical Grid components, sub-systems or services; the arrows between the boxes indicate which component, sub-system, service or actor initiates the action. For example, the arrow between the ACCO and the Grid monitoring system indicates that the ACCO uses the Grid monitoring system to gather monitoring data (in other words, the ACCO uses one of the Grid monitoring system’s interfaces to get data out of it).

The lower part of the diagram shows the Grid components, sub-systems or services that produce accounting data. The accounting data can be either monitoring data or other systems’ accounting records, and the ACCO accounting records will be produced by processing both. The upper part of the diagram shows the consumers of accounting records: the users and the VO management system.

Note that, for completeness, a Site Accounting system and a Site Monitoring system are depicted. These are site-dependent systems, each with its own metric-collecting sensors and data format.

2.1  Users

For the first deployment of OSG the main users are:

·  Site Resources Manager: he is fully responsible for managing the computing resources of a site or computer center. He is mainly interested in understanding, and being able to document, how the site’s computing resources are supporting the work of the various VOs.

·  VO manager: he is responsible for managing a VO. He interacts with the Accounting system on behalf of the VO. He is interested in understanding how the sites’ computing resources are supporting his VO’s work, and also how the VO members utilize the VO resources.

·  VO member: he is interested in understanding what resources he has used so far at the Grid level as well as at the site level.

·  Accounting system administrator: he is responsible for the maintenance of the Accounting system. This user will have unrestricted access to all accounting records.

For this version of the requirements we do not consider software components for resource brokering, nor other possible users such as auditors or those interested in enforcing site policies.

Figure 2: The Grid3 monitoring system architecture

2.2  System perspective

The OSG Accounting system will fully leverage the existing Grid3 monitoring system infrastructure to implement some of its functionalities. Figure 2 depicts the Grid3 monitoring system architecture. For more information about the deployed Grid3 monitoring system see [4].

2.2.1  System Interfaces

Interface with the VO Management Service (GUMS server)

Interface with the Grid Monitoring System

Interface with the site SE, SRM and GridFTP

Interface with the site CE (parsing the Gatekeeper log file)

Interface with the Site Monitoring system

Interface with the Site Accounting system

2.2.2  User Interfaces

The Accounting system will have two main interfaces, both web-based. The first is for navigating accounting records and displaying summaries of accounting information in both plot and table formats. It will be very simple to use: users will not require any special training, and a full help section with complete explanations of the accounting information will be provided. This interface will be based on drop-down menus that let users choose which accounting records to display, how to display them, and how to analyze them. Accounting records will not be modifiable from this interface.

The second interface will be used for managing the ACCO system. It will be more complex than the first and will allow full manipulation of the stored accounting records. Authentication will be needed to access this management interface.

Issue: at this stage it is not clear to me how the users will want to view the accounting information: do they want plots? Just numbers and tables?

2.3  Main Functionalities

The Accounting system will have the following main functionalities:

1.  The ability to gather accounting data about Grid resource utilization at each site in the Grid.

2.  The ability to process gathered accounting data (monitoring system records and sub-system accounting records) in order to extract meaningful (to the user) usage information (accounting records and summaries).

3.  The ability to store and manage accounting records.

4.  The ability to display and analyze the accounting records in a clear and unambiguous way.
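The four functionalities above can be read as a small pipeline. The following Python sketch is only an illustration under stated assumptions: the record fields (site, vo, cpu_seconds), the in-memory store, the sample values and all function names are hypothetical and are not taken from the Grid3 monitoring system or from any OSG interface.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AccountingRecord:
    """Hypothetical accounting record; the real data model is still to be defined."""
    site: str
    vo: str
    cpu_seconds: float


def gather():
    """1. Gather accounting data (hard-coded stand-ins for monitoring samples)."""
    return [
        {"site": "SiteA", "vo": "vo1", "user_s": 100.0, "sys_s": 20.0},
        {"site": "SiteA", "vo": "vo2", "user_s": 50.0, "sys_s": 5.0},
        {"site": "SiteA", "vo": "vo1", "user_s": 30.0, "sys_s": 10.0},
    ]


def process(raw):
    """2. Turn raw accounting data into accounting records."""
    return [AccountingRecord(r["site"], r["vo"], r["user_s"] + r["sys_s"]) for r in raw]


class RecordStore:
    """3. Store and manage accounting records (in memory only, for illustration)."""

    def __init__(self):
        self._records = []

    def add_all(self, records):
        self._records.extend(records)

    def query(self, site):
        return [r for r in self._records if r.site == site]


def summarize(records):
    """4. Produce a per-VO summary that a display interface could render."""
    totals = defaultdict(float)
    for r in records:
        totals[r.vo] += r.cpu_seconds
    return dict(totals)


store = RecordStore()
store.add_all(process(gather()))
summary = summarize(store.query("SiteA"))
print(summary)
```

Keeping the four steps as separate functions mirrors the separation of the functional requirements in section 4.2; a real implementation would replace gather() with the interfaces listed in section 2.2.1.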

2.4  Constraints

2.5  Assumptions and Dependencies

3  Use Cases

3.1  Actors

The actors of our use cases are the four users listed in section 2.1 plus the VO management system.

3.2  Site Resources Manager Use Cases

3.2.1  Use Case 1 - CE usage at specific site per VO

Description / The Site Resource Manager wants to find out how much of his site’s CPU time (system time + user time) was used by a certain VO in a given period of time.
Flow of events / The Site Resource Manager opens a browser and goes to the OSG web accounting page. Using pull-down menus he specifies that he is interested in accounting information for a specific site and for the CE resources, then selects the VO name, the time period and the time unit (seconds, hours or days). The ACCO system generates a web page that contains:
1.  a link to view the accounting records for the specified period of time,
2.  the information specified below,
3.  a set of buttons to change the way the information is displayed and to perform some data analysis.
Output information /
Name: # of acc. records; Symbol: NAR / Number of accounting records within the specified period of time.
Name: # jobs submitted; Symbol: JSUB / Number of jobs submitted to the site CEs by either a user or a resource broker.
Name: # jobs accepted; Symbol: JACC / Number of jobs accepted by the CEs. A CE accepts a job based on different checks and criteria; for example, the Grid certificate must be valid and all site policies must be respected. Once the job is accepted, the CE puts it on the local job manager queue.
Name: # completed jobs; Symbol: COMJ / Number of completed jobs. A job is completed when execution finishes without unexpected exceptions. For instance, if a job tries to read some files and fails, but the user code handles the exception and exits gracefully, then the job is completed.
Name: # failed jobs; Symbol: FAIJ / Number of failed jobs. A job has failed when execution is stopped because of an unexpected exception, for example a SEGV, or when the job fails to read a file because the file is not where it is supposed to be, the job’s code does not catch the exception, and the OS halts the execution (the job crashed).
Name: total CPU time; Symbol: CPUT; Units: seconds, hours, days / This is the sum of the CPU user time and CPU system time over all completed and failed jobs in the given period of time. For example, if my CE runs 2 jobs in one day, one completed and one failed, then the total CPU time = (user CPU time of job 1 + system CPU time of job 1) + (user CPU time of job 2 + system CPU time of job 2).
Name: time period; Symbol: TIPE; Units: / This is the interval of time for which the actor wants to get the accounting information. The accounting period start and end time must be shown.
Output format
Notes / If the site has more than one CE deployed, the system will display the accounting information for each separate CE as well as for the combined accounting records of all CEs.
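As a worked illustration of the output quantities of Use Case 1, the sketch below computes NAR, JSUB, JACC, COMJ, FAIJ and CPUT from a list of job records. It is not a proposed implementation. The JobRecord fields, the rule that a job belongs to the period when its submission time falls inside it, and the assumption of exactly one accounting record per submitted job (so that NAR equals JSUB) are all hypothetical choices made only for this example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class JobRecord:
    """Hypothetical accounting record for one job submitted to a CE."""
    vo: str
    submitted: datetime
    accepted: bool                  # passed the CE checks (certificate, site policies)
    outcome: Optional[str] = None   # "completed", "failed", or None if never run
    user_cpu_s: float = 0.0
    system_cpu_s: float = 0.0


SECONDS_PER_UNIT = {"seconds": 1, "hours": 3600, "days": 86400}


def ce_usage(records, vo, start, end, unit="seconds"):
    """Summarize CE usage for one VO over the period [start, end)."""
    selected = [r for r in records if r.vo == vo and start <= r.submitted < end]
    finished = [r for r in selected if r.outcome in ("completed", "failed")]
    # CPUT: user + system CPU time of every completed or failed job.
    cpu_seconds = sum(r.user_cpu_s + r.system_cpu_s for r in finished)
    return {
        "NAR": len(selected),
        "JSUB": len(selected),  # assumes one record per submitted job
        "JACC": sum(1 for r in selected if r.accepted),
        "COMJ": sum(1 for r in selected if r.outcome == "completed"),
        "FAIJ": sum(1 for r in selected if r.outcome == "failed"),
        "CPUT": cpu_seconds / SECONDS_PER_UNIT[unit],
        "TIPE": (start, end),
    }


records = [
    JobRecord("vo1", datetime(2005, 12, 6, 9), True, "completed", 3000.0, 600.0),
    JobRecord("vo1", datetime(2005, 12, 6, 10), True, "failed", 1200.0, 600.0),
    JobRecord("vo1", datetime(2005, 12, 6, 11), False),  # rejected by the CE
    JobRecord("vo2", datetime(2005, 12, 6, 12), True, "completed", 500.0, 50.0),
]
usage = ce_usage(records, "vo1", datetime(2005, 12, 6), datetime(2005, 12, 7), unit="hours")
print(usage)
```

For the sample data, the VO "vo1" has one completed and one failed job, so CPUT is (3000 + 600) + (1200 + 600) = 5400 seconds, reported as 1.5 hours. Hanging jobs, raised in the issues of this use case, are not handled by this sketch.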

Issues:

Output format: ask user, tables? plots? What kind of plot?

Job classification: what is the official definition of a job lifecycle?

What about hanging jobs?

What about memory?

3.2.2  Use Case 2 - Storage Space Usage at Specific Site per VO

Description / The Site Resource Manager wants to find out how much of his site’s storage space was reserved and used by a certain VO in a given period of time.
Flow of events / The Site Resource Manager opens a browser and goes to the OSG web accounting page. Using pull-down menus he specifies that he is interested in accounting information for a specific site and for the storage resources, then selects the VO name, the time period and the storage unit (MB-hours, GB-hours or TB-hours). The ACCO system generates a web page that contains:
1.  a link to view the accounting records for the specified period of time,
2.  the information specified below,
3.  a set of buttons to change the way the information is displayed and to perform some data analysis.
Output information /
Name: # of acc. records; Symbol: NAR / Number of accounting records within the specified period of time.
Name: space-type; Symbol: STYP / Following the SRM classification of space types, we consider volatile, durable and permanent space types.
Name: storage media; Symbol: SMED / Storage media gives information about the media that stores the data. This field should also specify the degree of reliability of the media and its access speed.
Name: guaranteed reser. space; Symbol: GURE; Units: MB-hours, GB-hours, TB-hours / See SRM doc.
Name: best-effort reser. space; Symbol: BERE; Units: MB-hours, GB-hours, TB-hours / See SRM doc.
Name: guaranteed used space; Symbol: GUUS; Units: MB-hours, GB-hours, TB-hours / See SRM doc.
Name: best-effort used space; Symbol: BEUS; Units: MB-hours, GB-hours, TB-hours / See SRM doc.
Name: time period; Symbol: TIPE; Units: / This is the interval of time for which the actor wants to get the accounting information.
Output format
Notes / If the site has more than one SE deployed, the system will display the accounting information for each SE as well as the information obtained by combining the accounting records of all the SEs.
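The space quantities of Use Case 2 are expressed in units such as GB-hours. One possible reading, assumed here and not confirmed by the SRM documentation this section refers to, is that such a quantity is the amount of space held multiplied by the time it was held, summed over the accounting period. The Python sketch below computes that value from timestamped samples; the sample format and the rule that each sample value holds until the next sample are hypothetical.

```python
from datetime import datetime


def gb_hours_used(samples, start, end):
    """Integrate space over time for the period [start, end).

    samples: list of (timestamp, gigabytes) sorted by timestamp. Each value
    is assumed to hold until the next sample, or until `end` for the last one.
    """
    total = 0.0
    for i, (t, gb) in enumerate(samples):
        next_t = samples[i + 1][0] if i + 1 < len(samples) else end
        seg_start = max(t, start)
        seg_end = min(next_t, end)
        if seg_end > seg_start:
            total += gb * (seg_end - seg_start).total_seconds() / 3600.0
    return total


samples = [
    (datetime(2005, 12, 1, 0), 100.0),   # 100 GB held for 12 hours
    (datetime(2005, 12, 1, 12), 200.0),  # 200 GB held for 12 hours
    (datetime(2005, 12, 2, 0), 50.0),    # 50 GB held until the period ends
]
gb_hours = gb_hours_used(samples, datetime(2005, 12, 1, 0), datetime(2005, 12, 2, 12))
print(gb_hours)
```

For the sample data the result is 100 x 12 + 200 x 12 + 50 x 12 = 4200 GB-hours. The same calculation would apply separately to each of GURE, BERE, GUUS and BEUS, given samples of the corresponding quantity.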

Issues: