DocuShare Handle # / Date Effective / Status
LSE-82 / 17 July 2011 / Version 1.1
Author(s)
Gregory Dubois-Felsmann
Data Management
Id Identification Tags
DM Sizing Model
Document Title
LSST Science and Project Sizing Inputs Explanation

The Large Synoptic Survey Telescope (LSST)


Document Control Sheet

Version / Date / Description / Owner name
1.0 / Original document created to match LSE-81 / Gregory Dubois-Felsmann
1.1 / 17 July 2011 / Updates to match version 6 of LSE-81 / Gregory Dubois-Felsmann

Introduction

The Data Management Sizing Model (see Collection-413) begins with a collection of key parameters that drive the scale of the design. These come from several sources:

·  The Science Requirements Document (LPM-17) and additional estimates of the science content of the survey

·  The two levels of system requirements documents, i.e., the LSR (LSE-29) and OSS (LSE-30)

·  Additional estimates of the sizes of a variety of elements of the system, beyond those captured as system requirements

·  Key assumptions about the design of the Data Management data processing model

These inputs are collected in the spreadsheet “LSST Science and Project Sizing Inputs”, LSE-81. They are sorted into the categories that follow.

The values in the tab “SciReq” in the workbook then form an interface that is respected by the compute, storage, and network requirements estimation spreadsheets, Document-2116, Document-1779, and Document-2194, respectively. To update those models, the contents of the interface tab are simply copied to the corresponding tab in the destination workbook.
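As a minimal sketch of that update step, the following Python models each workbook as a plain dictionary of tabs. The dictionary layout, the function name `update_interface_tab`, and the parameter name `exampleParameter` are hypothetical stand-ins for illustration, not the structure of the actual spreadsheets.

```python
import copy

INTERFACE_TAB = "SciReq"  # tab name taken from the text


def update_interface_tab(source_workbook, destination_workbook):
    """Overwrite the destination's interface tab with a copy of the source's."""
    # Deep copy so later edits to the source do not leak into the destination.
    destination_workbook[INTERFACE_TAB] = copy.deepcopy(source_workbook[INTERFACE_TAB])
    return destination_workbook


# Hypothetical workbooks: tab name -> {parameter name -> value}.
inputs = {"SciReq": {"exampleParameter": 1.0}, "Notes": {}}
compute_model = {"SciReq": {"exampleParameter": 0.5}, "Compute": {}}
update_interface_tab(inputs, compute_model)
```

Only the interface tab is replaced; every other tab in the destination workbook is left untouched, which is what lets the downstream models evolve independently of the inputs.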

In the future, this document will be extended with detailed footnotes giving the provenance of numbers obtained from previous surveys, detailed analyses, etc.

Science Estimates

The centerpiece of this section is a set of estimates of the numbers of detectable stars and galaxies in the full planned survey, assuming that the basic SRD specifications are met, and using the operations simulator and exposure time calculator to estimate the survey reach.

These estimates are subject to considerable and difficult-to-quantify uncertainty because of the unprecedented combination of breadth and depth of the LSST survey, and because of the range of filter bands available (e.g., the SDSS did not have a y filter). The use of existing narrow-angle deep fields is essential to the estimates, but the resulting precision is limited by unknown cosmic variance effects.

The galaxy estimates are based on previous surveys and on available deep-field data to explore the faint limit. The star estimates are based on Milky Way structure models derived primarily from the SDSS data, and extrapolated to the faint limit expected for the survey based on observations of nearby stars.

The star estimates are further complicated by the expected challenges of imaging and deblending in the crowded fields around the direction of the Galactic center, and because the survey design in that region is less fully explored than for out-of-plane regions.

Expected future improvements: More careful modeling of the effective luminosity functions, as a function of filter band, and of the true detection limit achievable by the combination of detection on a chi-squared panchromatic coadd and detection in single-band coadds. More explicit linkage of the incidence of detectable variability to the photometric performance expected as a function of epoch.

Camera Specifications

These are very basic parameters of the camera (and telescope) design that drive the raw image size and the unit of sky coverage. They derive primarily from the LSR and OSS.

Survey/Cadence Specifications

Starting with parameters from the OSS, this section quantifies the scope of the full survey in sky coverage and number of raw exposures. It makes conservative (i.e., trending toward larger requirements) estimates of the number of nights and hours of observing time.
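A rough sketch of how such an exposure count could be derived is given below. The function name, its parameters, and all numeric inputs are hypothetical placeholders chosen for illustration; they are not values from the OSS or LSE-81, and the actual spreadsheet may structure the calculation differently.

```python
def raw_exposure_count(nights_per_year, hours_per_night,
                       seconds_per_exposure, survey_years):
    """Upper-bound count of raw exposures over the whole survey.

    Assumes every scheduled hour is used for observing, which makes the
    estimate conservative in the sense of trending toward larger requirements.
    """
    seconds_observing = nights_per_year * hours_per_night * 3600 * survey_years
    return seconds_observing // seconds_per_exposure


# Placeholder inputs for illustration only.
count = raw_exposure_count(nights_per_year=300, hours_per_night=10,
                           seconds_per_exposure=20, survey_years=10)
print(count)
```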

Expected future improvements: Treatment of the possibility of the main 18,000 sq. deg. survey being accompanied by less-deep coverage of additional sky area.

Engineering & Facility Database Specifications

This section contains simple estimates of the (relatively small) size requirements for the storage of telemetry from the Observatory and images and spectra from the auxiliary telescope. The telemetry estimates are taken from the OCS telemetry channel lists. They will be updated as the design of the OCS and the subsystem interfaces to it progress.

Network Requirements

The network requirements are based on the raw data volume, estimates of the availability of network bandwidth between sites and the reliability of the networks, and a model that requires excess capacity on all long-distance network links for catching up after outages.
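One simple way to express the catch-up requirement is sketched below: after an outage, a link must carry newly arriving data while also draining the backlog within a chosen recovery time. This is an illustrative model under stated assumptions, not necessarily the one used in LSE-81; the function and parameter names are hypothetical.

```python
def required_link_rate(average_data_rate, outage_hours, catchup_hours):
    """Link rate needed to clear an outage backlog within catchup_hours.

    average_data_rate is in any consistent unit per hour. During catch-up
    the link carries the ongoing data stream plus the accumulated backlog.
    """
    backlog = average_data_rate * outage_hours
    return average_data_rate + backlog / catchup_hours


# Placeholder scenario: a 24-hour outage recovered over the following 48 hours.
rate = required_link_rate(average_data_rate=1.0, outage_hours=24, catchup_hours=48)
print(rate)
```

In this scenario the link needs 1.5 times the average data rate, so the excess capacity is set by the ratio of the tolerated outage length to the allowed recovery time.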

Image Storage Requirements

This section begins the modeling of the processing components of DM. Here we specify the assumed sizes of the image caches that allow processing to proceed without keeping all the image data on spinning disk. Notably, we assume 30-day sliding windows of recent calibrated images, with the remainder recreated on demand, and of the raw calibration image data used as inputs to the calibration data products production.
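The size of such a sliding-window cache can be sketched as the product of the window length, the nightly image count, and the per-image size. The function name and the numeric inputs below are hypothetical placeholders; only the 30-day window comes from the text.

```python
WINDOW_DAYS = 30  # sliding-window length stated in the text


def sliding_window_bytes(images_per_night, bytes_per_image, window_days=WINDOW_DAYS):
    """Disk space needed to hold every image taken in the last window_days nights."""
    return images_per_night * bytes_per_image * window_days


# Placeholder inputs for illustration only.
cache_bytes = sliding_window_bytes(images_per_night=1000, bytes_per_image=10**9)
print(cache_bytes)
```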

We also document the assumptions for the construction of deep coadds (stacks of all exposures taken) and templates (coadds of exposures taken in particularly good seeing, for the purpose of transient detection and high-spatial-resolution measurement of brighter objects). This includes an assumption that separate templates will be maintained for different airmasses, to minimize the need for PSF-matching across large variations in seeing and extinction.

Data Release Production Specifications

This section expresses a model for the operation of Data Release Production and its validation with simulated data. It assumes a phased sequencing of an instance of DRP and expresses a budget of time allocated to each phase. It also includes a basic assessment of the size of the numeric data products derived from the data for each observed object.

Calibration Products Production Specifications

This section estimates the quantity of calibration data products to be produced.

User Image Access Specifications

This section estimates the rate at which users will require access to image data, both small cutouts around objects and larger requests. The specifications for query size and load are taken from the SDSS experience.

Note that a request for calibrated science images triggers recomputation unless the images are in the assumed cache described under Image Storage Requirements.
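The cache test implied here can be sketched as a date comparison against the sliding window. The function name and the treatment of the window boundary (an image exactly 30 days old still counts as cached) are assumptions made for illustration.

```python
from datetime import date, timedelta


def needs_recompute(observed_on, today, window_days=30):
    """True when a calibrated image falls outside the sliding cache window."""
    return (today - observed_on) > timedelta(days=window_days)


# An image 46 days old must be regenerated; one 16 days old is served from cache.
print(needs_recompute(date(2011, 6, 1), date(2011, 7, 17)))
print(needs_recompute(date(2011, 7, 1), date(2011, 7, 17)))
```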

User Catalog Query Specifications

These estimates of query size and rate are based on the SDSS experience, the Science Collaboration survey of expected queries, and the Community Access White Paper, where they are discussed in more detail.

To do: document the specifics of the model queries in more detail.

L3 Processing Specifications

This section records an allocation set by fiat: the resources for generating and storing user (“Level 3”) data products are fixed at a flat 10% increment on the computing and storage resources required to perform the survey and the required Data Release computing.
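As a worked illustration of the increment, the sketch below applies the 10% figure to the sum of the survey and Data Release resources, which is one reading of the text. The function name and the resource figures are hypothetical placeholders in arbitrary units.

```python
LEVEL3_FRACTION = 0.10  # the flat 10% increment stated in the text


def level3_allowance(survey_resources, data_release_resources):
    """Return (allowance, total) for a resource measured in a single unit."""
    base = survey_resources + data_release_resources
    allowance = LEVEL3_FRACTION * base
    return allowance, base + allowance


# Placeholder resource figures, for illustration only.
allowance, total = level3_allowance(survey_resources=600.0,
                                    data_release_resources=400.0)
print(allowance, total)
```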

EPO Specifications

These specifications bound the network data transfer load from DM to EPO.

Common Constants and Derived Values

This section contains the definitions of certain standard conversion constants, as well as the computation of certain commonly used sizing parameters derived from the above specifications, e.g., the size of a raw image in bytes.
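A derived value of this kind can be sketched as follows. The pixel count and bit depth are placeholders rather than camera specifications, and the decimal gigabyte convention is an assumption; the conventions actually used are those defined in LSE-81.

```python
BITS_PER_BYTE = 8
BYTES_PER_GIGABYTE = 10**9  # decimal convention assumed for this sketch


def raw_image_bytes(pixel_count, bits_per_pixel):
    """Size of an uncompressed raw image, rounded up to whole bytes."""
    total_bits = pixel_count * bits_per_pixel
    # Ceiling division, so a partial final byte still occupies a full byte.
    return -(-total_bits // BITS_PER_BYTE)


# Placeholder inputs for illustration only.
size = raw_image_bytes(pixel_count=1_000_000, bits_per_pixel=16)
print(size, size / BYTES_PER_GIGABYTE)
```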