Program Execution Services
Edited by Andrew Grimshaw, December 18, 2003
Job: A job is a grid service (named by a distinct GSH) and is created at the instant it is requested … even though no resources have been committed. The job service does not necessarily “run” on/in the container where the actual “job” runs, whether the “job” is a legacy code or not. A job encapsulates a job document. A job is not a workflow. A job may be instantiated by a job factory.
A job goes through distinct states: started, suspended, restarted, terminated, completed, etc. These states are kept in the job document (described below).
Jobs will have a manageability interface that will include functions such as start, stop, migrate, limit resource consumption, check resource consumption, etc.
Sub-type picture here.
It is envisioned that the job type will be sub-classed into many different subclasses. For example, we envision a “legacy_job” subclass that has methods for redirecting TTY I/O, reading and writing local files, capturing a checkpoint, etc.
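A minimal sketch, in Java, of what such job and legacy_job interfaces might look like. All names here (GSH, JobState, Job, LegacyJob) and all method signatures are illustrative assumptions, not part of any agreed specification:

    // Illustrative only: an opaque grid service handle.
    record GSH(String uri) {}

    // Job states as sketched above.
    enum JobState { STARTED, SUSPENDED, RESTARTED, TERMINATED, COMPLETED }

    // Manageability interface of a job (hypothetical method names).
    interface Job {
        GSH handle();                        // the distinct GSH naming this job
        JobState state();                    // kept in the job document
        void start();
        void stop();
        void migrate(GSH targetContainer);
        void limitResourceConsumption(String resource, long limit);
        long checkResourceConsumption(String resource);
        String jobDocument();                // exposed as service data
    }

    // A possible "legacy_job" subclass, as envisioned above.
    interface LegacyJob extends Job {
        void redirectTtyIo(GSH source, GSH sink);
        byte[] readLocalFile(String path);
        void writeLocalFile(String path, byte[] data);
        byte[] captureCheckpoint();
    }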
Job Document: A job document describes the state of the job, e.g., the agreements that have been acquired, the JSDL, and how many times it has been started. By “state” we do not mean, for example, the internal memory of a BLAST job. The job document is encapsulated by a job and exposed as service data of the job.
Container:
A container “contains” running services, e.g., a queuing service, a Unix host, a J2EE hosting environment, or a collection of containers (a façade or a VO of job containers). Job containers will have attributes (a.k.a. SDE’s) that describe both static information, such as what kind of executables they can take (OS version, libraries installed, policies in place, security environment/QOS, etc.), and dynamic information, such as load, QOS issues, etc.
A container also establishes a “context”; here “context” means the agreement, security, and I/O environment contexts.
Port types
start – GSH of a job service?
kill
signal
check-point
deploy – install an application or application component, e.g., binary, libraries
Throws
lots of different exceptions
SDE’s
load
CMM kind of stuff
A container is a subclass of a managed resource – and will have a manageability interface.
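A sketch of the container port types listed above, in the same hypothetical Java style (GSH is the record from the job sketch; ContainerException stands in for the “lots of different exceptions”):

    import java.util.Map;

    interface Container {
        // start – returns the GSH of a job service? (as questioned above)
        GSH start(String jobDocument) throws ContainerException;
        void kill(GSH job) throws ContainerException;
        void signal(GSH job, int signal) throws ContainerException;
        byte[] checkpoint(GSH job) throws ContainerException;
        // deploy an application or application component, e.g., binary, libraries
        void deploy(String name, byte[] executable) throws ContainerException;

        // SDE's: load, CMM kind of stuff, static and dynamic information.
        Map<String, String> serviceData();
    }

    // Placeholder for the many different exceptions noted above.
    class ContainerException extends Exception {
        ContainerException(String message) { super(message); }
    }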
Class/subclass diagram
Resource
    Container
        J2EE
        Executable
            Unix
                BatchQueue
                    LSF
                    PBS
                    SGE
                    etc.
            Windows
        Your_favorite_execution_environment
    Vault
        File_system (a la C stdio libraries)
        RDBMS
            JDBC
            DB2
            Oracle
            MySQL
        Your_favorite_storage_system
Containers will have various relationships to other resources that will be exposed to clients. For example, a container may have a “compatibility” relationship with persistent state “vaults” indicating that services running “in” the container can access persistent data “in” a particular vault. Other managed resources might similarly be a deployed executable, a physical network, etc.
The relationships with other resources are critical. We expect that sets of managed resources will be composed into higher-level services – for example, a “container” may be extended to a “host-container” that includes a “container”, a “vault”, an OS, etc.
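A sketch of how such a composition might look, reusing the hypothetical Container interface above and the Vault interface sketched below; HostContainer and OperatingSystem are purely illustrative:

    // A higher-level service composed from managed resources,
    // e.g., a "host-container" bundling a container, a vault, and an OS.
    record OperatingSystem(String name, String version) {}

    record HostContainer(Container container,
                         Vault vault,
                         OperatingSystem os) {

        // Expose the "compatibility" relationship: services running in
        // this container can access persistent data in this vault.
        boolean isCompatibleWith(Vault v) {
            return vault.equals(v);   // trivially true for the bundled vault
        }
    }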
Vault. A vault represents a container for persistent state. It may be implemented many different ways: by a file system, by a database, by a hierarchical storage system, etc. A vault is a subclass of a resource, and therefore has a manageability interface. Like a “container”, it maintains relationships to other resources. A vault will have methods to get a “handle” to persistent state that it is managing (called a “persistent address”). The form of the handle will depend on how the state is actually stored. A persistent address may be a path name in a file system or a primary key value in a database. The key idea is that the persistent address can be used to directly access the data.
Vaults will also have methods for managing their contained state, including passing it to other vaults. This will facilitate both migration and replication.
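A hedged Java sketch of a vault along these lines; PersistentAddress, the method names, and the String state identifiers are all assumptions for illustration:

    import java.util.List;

    // A persistent address: a path name, a primary key value, etc.
    record PersistentAddress(String form, String value) {}

    interface Vault {
        // Get a handle to persistent state this vault manages; the form
        // of the handle depends on how the state is actually stored.
        PersistentAddress persistentAddress(String stateId);

        // Manage contained state, including passing it to another vault,
        // facilitating both migration and replication.
        void transferTo(Vault destination, String stateId);

        List<String> containedState();
    }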
Job Manager. The manager manages jobs. For example, it may be a workflow manager, an array-job manager, or a portal that interacts with users. It may deal with failures and restarts, schedule jobs to resources, and collect agreements and reservations. It is responsible for making the job – or set of jobs.
It is very likely to be a subtype of the WSDM collection, which is a collection of manageable entities. A WSDM collection can expose as its methods some of the methods exposed by the members of its collection. There may be stop, kill, etc.
The manager is responsible for orchestrating the set of services to start a job or set of jobs, e.g., negotiating agreements, interacting with containers, monitoring and logging services, etc. It may also aggregate job service data from underlying “related” job instances.
Port-types
instantiate a job
instantiate a set of jobs
destroy a job
reschedule (migrate) a job or set of jobs
list set of jobs – via service group (bag)
negotiate new agreements
“May” implement job service port types as a means of “inheriting” (at least the manageability port types).
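A sketch of the job manager port types above, under the same assumptions (hypothetical names, GSH as before):

    import java.util.Set;

    interface JobManager {
        GSH instantiateJob(String jobDocument);
        Set<GSH> instantiateJobs(Set<String> jobDocuments);
        void destroyJob(GSH job);
        void reschedule(Set<GSH> jobs);   // migrate a job or set of jobs
        Set<GSH> listJobs();              // via service group (bag)
        void negotiateAgreements(Set<GSH> jobs);
    }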
Workflow manager – may be a subtype of a “job manager”
Execution Planning Service (EPS) (a.k.a. scheduling service)
A service that builds temporal relationships between jobs and resources, e.g., containers and vaults.
Definition: A schedule is a mapping (relation) between grid services and resources, possibly with time constraints. (Do we include precedence relationships?)
Proposal: A schedule can be extended with a list of alternative “schedule deltas” that basically say, “if this part of the schedule fails, try this one instead”.
We need to begin thinking about port types, inputs and outputs. It is likely that the input document will be some sort of JSDL document.
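A sketch of how a schedule and the proposed “schedule deltas” might be represented, assuming the hypothetical GSH record from earlier; time constraints are modeled as optional instants, and deltas are modeled, simplistically, as alternative schedules:

    import java.time.Instant;
    import java.util.List;
    import java.util.Optional;

    // One entry of the mapping: grid service -> resource, with optional
    // time constraints.
    record Binding(GSH service, GSH resource,
                   Optional<Instant> notBefore, Optional<Instant> notAfter) {}

    // A schedule plus alternative "schedule deltas": if this part of the
    // schedule fails, try one of these instead.
    record Schedule(List<Binding> bindings, List<Schedule> alternatives) {}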
Candidate set generator
Builds a one-to-many relation of job (grid service) to candidate resource mappings, i.e., where it is possible to map a service.
There may be a default information service.
Inputs:
A set of GSH’s (e.g., for job services)
optional : the GSH of a information service
<One vote for passing in a list of individual “resources”>
optional: a static set of resources passed in.
Output:
A set of pairs <GSH_of_service, list of “resource” GSH’s>
The lists are unordered; there is no implied preference or ordering.
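A sketch of the candidate set generator along these lines; the interface name and the use of Optional for the two optional inputs are assumptions:

    import java.util.List;
    import java.util.Map;
    import java.util.Optional;
    import java.util.Set;

    interface CandidateSetGenerator {
        // One-to-many relation: each job GSH maps to an unordered list of
        // candidate resource GSHs (no implied preference or ordering).
        Map<GSH, List<GSH>> candidates(
                Set<GSH> jobs,
                Optional<GSH> informationService,    // optional input
                Optional<Set<GSH>> staticResources); // optional static set
    }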
Information Service
A discovery service is an example. The basic idea is that an information service is a place where one can find information about resources, similar to MDAS services and collections.
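A minimal sketch of what an information service interface might offer, assuming a simple attribute-map view of resource descriptions:

    import java.util.List;
    import java.util.Map;

    interface InformationService {
        // Find resources whose attributes match a (hypothetical) filter,
        // e.g., {"os" -> "Linux", "arch" -> "sparc"}.
        List<GSH> find(Map<String, String> attributeFilter);

        // Look up the attributes of a known resource.
        Map<String, String> attributes(GSH resource);
    }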
Deployment service
A deployment service is able to deliver appropriate, compatible “executables” for particular named services to particular named job containers. It will be used primarily by services such as the job manager, and possibly by job container services.
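A sketch of a deployment service interface under these assumptions (the exception type is a placeholder):

    interface DeploymentService {
        // Deliver an appropriate, compatible executable for a named
        // service to a named job container.
        void deploy(String serviceName, GSH jobContainer)
                throws NoCompatibleExecutableException;
    }

    class NoCompatibleExecutableException extends Exception {}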
Reservation services
Reservation services manage reservations for resources, interact with accounting services (there may be a charge for making a reservation), revoke reservations, etc.
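A sketch of a reservation service along these lines; Reservation and the method names are illustrative:

    import java.time.Instant;

    record Reservation(String id, GSH resource, Instant start, Instant end) {}

    interface ReservationService {
        // There may be a charge for making a reservation, so the service
        // may interact with accounting services internally.
        Reservation reserve(GSH resource, Instant start, Instant end);
        void revoke(Reservation reservation);
    }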
Monitoring
Simply starting something up is often insufficient. Applications (which may include many different services/components) often need to be continuously monitored, both for fault-tolerance reasons as well as for QOS reasons. For example, the conditions on a host that caused the scheduler to select it may have changed, possibly indicating that the task (service instance) needs to be rescheduled.
Fault-Detection and Recovery Services
These may or may not be a part of monitoring. They may range from simple schemes for stateless functions that allow trading off performance and resource usage, through slightly more complex schemes that manage checkpoint and recovery of single-thread (single-process) jobs, to more complex schemes that manage applications with distributed state, e.g., MPI jobs.
Accounting/billing/logging services
Accounting, auditing, logging, and billing services are critical for the success of OGSA outside of academia and the government. This will include the ability for schedulers to interact with resources to establish prices, as well as for resources to interact with accounting and billing services.
"Compatibility" checking services
One of the problems faced in the grid is determining which hosts (CMM hosting services) are candidates for the execution of a service: not all services can run on all hosts. In reality there may be many different implementations of a service with different QOS features as well as different hosting requirements – for example, Java, Sun native, and AIX native implementations. Unfortunately, just saying that something is a Sun native implementation is insufficient to determine whether a binary can run on a particular Sun: there are OS versions, installed libraries, license restrictions, etc. A compatibility checking service will determine whether a particular implementation can execute on a particular host.
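A sketch of a compatibility checking service, with a deliberately naive default that treats both the implementation’s requirements and the host’s properties as flat attribute maps (a real checker would be far richer):

    import java.util.Map;

    interface CompatibilityChecker {
        // Decide whether a particular implementation can execute on a
        // particular host: OS versions, installed libraries, license
        // restrictions, etc. all matter.
        boolean canExecute(Map<String, String> implementationRequirements,
                           Map<String, String> hostAttributes);
    }

    // A naive default: every required attribute must match exactly.
    class ExactMatchChecker implements CompatibilityChecker {
        public boolean canExecute(Map<String, String> req,
                                  Map<String, String> host) {
            return req.entrySet().stream()
                      .allMatch(e -> e.getValue().equals(host.get(e.getKey())));
        }
    }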
License management services
License management services will be needed to manage access to licenses and ensure that the appropriate licenses are available when needed.
Queuing
Queuing services are higher-level services that provide enq, deq, re-prioritize, get status, etc. Queuing services will be implemented using other services such as schedulers, VO’s, data provisioning, and so on. They will be the user-facing part for legacy codes.
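A sketch of the queuing operations named above, with hypothetical signatures:

    interface QueuingService {
        String enq(String jobDocument);   // returns a queue entry id
        void deq(String entryId);
        void reprioritize(String entryId, int newPriority);
        String getStatus(String entryId);
    }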
Data provisioning
Getting the data to applications is key. This may involve copying and replication services such as GridFTP, grid file system services, caching services, and backend data services as described in the OGSA Data Services draft.
Directory Services
Often resources, applications, files, and other services will need to be “named” by humans. A GSH is an awkward – and perhaps not useful – way for humans to name things. Therefore directory services that map human-readable names, e.g., path name strings, to GSH’s will be very handy. For example, I might want to call an application /bio/apps/blast, or a collection of resources contained by a VO /hosts/Virginia/CS. A similar notion targeted at files is under consideration in the Data Area. I propose that this should be expanded to cover all sorts of services – not just files – and that such a naming scheme will simplify program execution. For example, a user may specify that a job should run on a particular resource “grid_run –h /hosts/Virginia/CS /bio/apps/blast inputfile” and tell
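A sketch of a directory service mapping path names to GSHs, matching the examples above; bind/resolve are assumed names:

    import java.util.Optional;

    interface DirectoryService {
        // Map a human-readable path name to a GSH, e.g.,
        // "/bio/apps/blast" or "/hosts/Virginia/CS".
        void bind(String path, GSH service);
        Optional<GSH> resolve(String path);
    }

    // Hypothetical usage matching the example above:
    //   directory.resolve("/bio/apps/blast")
    //       .ifPresent(app -> jobManager.instantiateJob(jobDocumentFor(app)));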