Annual Report for 1 June 2005 through 30 May 2006:

Activities, Findings, and Contributions

1 Introduction

2 Deployment Status & CSA06

2.1 Hardware Infrastructure

2.1.1 California Institute of Technology

2.1.2 University of California San Diego

2.1.3 University of Florida

2.1.4 University of Wisconsin at Madison

2.2 DISUN Performance during CSA06

3 Distributed Computing Tools Activities

3.1 CMS Application Software Installations on the Open Science Grid

3.2 Monte Carlo Production System Development

3.3 Monte Carlo Production at the US CMS Tier-2 Centers

3.4 Towards Opportunistic Use of the Open Science Grid

3.5 Scalability and Reliability of the Shared Middleware Infrastructure

4 Engagement and Outreach

4.1 “Expanding the User Community”

4.2 From local to global Grids

5 Summary

6 References

1 Introduction

We develop, deploy, and operate distributed cyberinfrastructure for applications requiring data-intensive distributed computing technology. We achieve this goal through close cross-disciplinary collaboration with computer scientists, middleware developers, and scientists in other fields with similar computing technology needs. We deploy the Data Intensive Science University Network (DISUN), a grid-based facility comprising computing, network, middleware, and personnel resources from four universities: Caltech, the University of California at San Diego, the University of Florida, and the University of Wisconsin, Madison. To enable over 200 physicists distributed across the US to analyze petabytes of data per year from the CMS detector at the Large Hadron Collider (LHC) at CERN, DISUN will address generic problems with broad impact. DISUN will constitute a sufficiently large, complex, and realistic operating environment to serve as a model for shared cyberinfrastructure for multiple disciplines and to provide a test bed for novel technology. In addition, DISUN and the CMS Tier-2 sites will be fully integrated into the nation-wide Open Science Grid as well as with campus grids (e.g., GLOW), making this shared cyberinfrastructure accessible to the larger data-intensive science community.

This report is prepared for the annual agency review of the US CMS software and computing project. These reviews happen to be out of phase with the DISUN funding cycle by six months. As a result, we report here only on the six-month period since the last DISUN Annual Report, and refer to that report for additional information.

The DISUN funds of $10 million over 5 years are budgeted as 50% hardware and 50% personnel funds. The hardware funds are spent as late as possible in order to benefit maximally from Moore’s law, while still providing sufficient hardware to commission the computing centers and the overall cyberinfrastructure, and to provide computing resources in preparation for the physics program of the CMS experiment. The personnel are fully integrated into the US CMS software and computing project. Half of the DISUN personnel are dedicated to operations of the DISUN facility and are fully integrated into the US CMS Tier-2 program, led by Ken Bloom.

The other half of the personnel is part of the “distributed computing tools” (DCT) group within US CMS, which is led by DISUN. Over the last 18 months, this effort has focused on the following areas. First, DCT contributes to the development of the CMS Monte Carlo production infrastructure that is used globally in CMS. Second, DISUN is responsible for all CMS Monte Carlo production on the OSG. This includes both generation at the US CMS Tier-2 centers and opportunistic production on the OSG, and we discuss these two separately below. Third, DISUN centrally maintains the CMS application software installations on all of the OSG. These installations are used both by Monte Carlo production and by user data analysis on the Open Science Grid. As a fourth focus area, DISUN is working with Condor and the OSG on the scalability and reliability of the shared cyberinfrastructure. In addition to these primary focus areas, DISUN pursues engagement and outreach to enable scientists in other domains to benefit from grid computing, and works with campus, regional, and global grids other than the Open Science Grid on issues related to interoperability, in general working towards the goal of establishing a worldwide grid of grids.

This document starts by describing our hardware deployment status, followed by the performance achieved during CSA06, the major service challenge of the last 6 months. Section 3 then describes the DISUN activities within the context of DCT, while Section 4 details the engagement and outreach activities of the last 6 months.

2 Deployment Status & CSA06

DISUN has a strong operations and deployment component, including funding for $5 million in hardware and four people over 5 years across the four computing sites: Caltech, UCSD, UFL, and UW Madison. This section describes the hardware deployment status, as well as the performance achieved during CSA06, the major CMS service challenge of the last 6 months.

2.1 Hardware Infrastructure

The main CMS milestone for this year was a 20% deployment of the hardware infrastructure, in both complexity and scale. The complexity goal was met by deploying services and demonstrating their successful operation during Service Challenge 3 in Fall 2005, as well as in a variety of milestones since. These services included:

  • Functional compute cluster including batch system.
  • Functional storage cluster including dCache.
  • Functional wide-area connectivity of 10 Gbps (shared), and demonstrated LAN bandwidth of 1 MB/sec per batch slot.
  • Functional Open Science Grid software stack, including a Compute Element, a Storage Resource Manager (SRM/dCache), MonALISA monitoring, and a fully configured Generic Information Provider (GIP), among others.
  • The CMS-specific PhEDEx data transfer system.
  • The CMS-specific PubDB data publication mechanism.

The WAN goals are DISUN-specific and exceed the US CMS S&C goals. The goals for deployed hardware scale were set at 20% of the 2007 goal of 200 TB of storage and 1000 kSpecInt2000 of compute power, i.e., 40 TB and 200 kSpecInt2000. These goals were defined in coordination with US CMS S&C; they are thus goals to be met at the end of calendar year 2005, and do not exactly reflect end-of-year-1 deployments. As some of the sites had remaining hardware funds from other projects, we furthermore decided to delay spending DISUN hardware funds whenever the deployment goals could be met with other funds. This decision was motivated by the desire to purchase hardware as late as reasonable in order to benefit maximally from Moore’s Law. All four sites easily met the deployment goals, as discussed in detail below for each site.

DISUN is playing a leadership role in data movement, both within global CMS and within the Open Science Grid. This includes all aspects of data movement, from networks to storage to performance characterization to monitoring, as discussed in a number of places in this report. To be able to play this leadership role, all four sites provisioned at least 10 Gbps of shared WAN connectivity during this year. This was accomplished at all four sites independent of DISUN hardware funds.

2.1.1 California Institute of Technology

In the first year of DISUN funding, Caltech continued to upgrade its prototype Tier2’s computing hardware and software, together with its network infrastructure. In September 2005, thirty 1U dual-core Opteron 275 worker nodes were purchased using DISUN funds at a per-node cost of ~$4,400, for a total of ~$152k including mounting hardware and miscellaneous items. Each node comprises a Supermicro H8DART motherboard housing the processors, a 3ware 9500 LP SATA hard disk controller managing four 300 GB Seagate drives, 4 GBytes of DDR4000 RAM, a CD drive, and a floppy drive. After augmenting the capacity of the Tier2 with these nodes, the configuration of the cluster was simplified by combining the Opteron-based and Xeon-based subclusters that had been managed as two separate entities for historical reasons. The result is a mixed-architecture cluster comprising 66 nodes in total, adding up to 254 kSpecInt2000.

The Caltech Tier2 offers 34 TBytes of disk space in a resilient dCache, in addition to 1.8 TBytes of NAS storage for OSG. Networking equipment to support the Tier2 interconnects between worker nodes and head nodes, and to the WAN, was also purchased in 2005. This equipment includes two 48-port Gbit modules for insertion in our Force10 E600 switch used for worker node connections, two 4-port 10 Gbit modules and XFP optics for external connections to the Force10, two 48-port Foundry Gbit switches with integrated 10 Gbit XFP optics, a SANbox 5200 Fibre Channel switch with sixteen 2 Gbit ports and SFP optics (intended for SATABEAST or other Fibre Channel based future storage), and around ten Neterion 10 Gbit NICs. The total cost charged to DISUN funding for this equipment and miscellaneous related items was $20k.

2.1.2 University of California San Diego

At the start of DISUN, the UCSD group was operating a compute and storage cluster for the CDF and CMS experiments comprising 200 kSi2000 of compute power and 20 TB of disk space across about 94 compute nodes and two infrastructure nodes, plus a 5 TB RAID5 fileserver. The cluster networking consisted of a Gbps HP ProCurve switch that did not allow for 10 Gbps WAN uplinks. UCSD thus met the CPU goals for 2005 prior to the start of DISUN, and DISUN’s focus at UCSD during year 1 was therefore on networking, storage, and infrastructure services to meet the complexity goals.

The cluster was migrated to a Cisco 6509 in late Summer 2005, including 4x48 Gbps ports and 4x10 Gbps ports. Three of the four 10 Gbps uplinks are presently connected. A connection to CENIC was established in late Summer 2005, followed by a connection to ESnet in Fall 2005 and a connection to the TeraGrid in early 2006. The TeraGrid connection was temporarily enabled for SC05 and exercised at the level of a few Gbps of iperf traffic. It then took a few more months to sort out the administrative details to route our SRM/dCache WAN connections over the TeraGrid production network. The TeraGrid connectivity is now being used for data export in support of an MRAC allocation for generating Monte Carlo for several electroweak physics analyses at CDF, including WW and WZ production as well as a search for Higgs decays to WW.
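As an illustration of the kind of memory-to-memory throughput exercise mentioned above, a multi-stream iperf test can be scripted along the following lines. This is only a sketch: the host name, stream count, and test duration are illustrative placeholders, not the actual SC05 configuration.

    #!/usr/bin/env python
    # Sketch of a multi-stream WAN throughput check with iperf.
    # TARGET, STREAMS, and DURATION are illustrative placeholders.
    import subprocess

    TARGET = "iperf-server.example.org"   # hypothetical remote iperf server
    STREAMS = 8                           # parallel TCP streams
    DURATION = 30                         # seconds per measurement

    # Requires "iperf -s" to be running on TARGET.
    result = subprocess.run(
        ["iperf", "-c", TARGET, "-P", str(STREAMS), "-t", str(DURATION), "-f", "m"],
        capture_output=True, text=True, check=True)

    # With multiple streams, iperf prints a [SUM] line with the aggregate rate.
    for line in result.stdout.splitlines():
        if "[SUM]" in line:
            print(line)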

At this point, the DISUN cluster with its 3x10 Gbps connections has better WAN connectivity than any other cluster located at the San Diego Supercomputer Center, including the TeraGrid infrastructure itself, which was recently scaled down from 3x10 Gbps to 1x10 Gbps. We have started exercising our WAN connectivity with detailed tests to understand and eliminate bottlenecks in end-to-end performance, as detailed in Section 2.2 below.

Existing compute nodes were retrofitted with an additional 48x400 GB drives in late Summer 2005 to meet the 2005 storage goals. A total of 20 infrastructure nodes were commissioned and deployed to host a set of services exceeding the minimum required by the CMS goals. The additional services include a Clarens server for JobMon (Section 3.5), a Xen node (Section 3.9), and squid servers for CDF and CMS. In particular, a special DISUN focus at UCSD is SRM/dCache, and we therefore deployed a more distributed infrastructure solution than at any other site outside of FNAL, to support the development, integration, and testing focus in this area described in Sections 2.1 and 2.2 below. We are presently awaiting delivery of 32 dual-CPU dual-core Opteron nodes with 2 TB of disk space each. We expect these to be fully in production before the end of year 1, and thus in time for our participation in SC4 and CSA06, the summer and fall 2006 LHC computing service challenges.

The total hardware procurement adds up to about $225k, of which about $50k is paid for by DISUN funds. On the personnel side, we filled our last open position at UCSD in January 2006.

2.1.3 University of Florida

At the start of DISUN, the University of Florida Tier2 facility consisted of 65 dual 1 GHz PIII compute nodes, 4 dual Xeon 3.0 GHz servers used for CMS interactive analysis, and several other master and administrative nodes. The facility also included 6 dual Xeon or Opteron class RAID fileservers that provided a total of 11.9 TB of storage. The local area network was provided by a Cisco 4003 switch, plus a few spare ports on the departmental switches. The 4003 connected the compute nodes at 100 Mbps, and the WAN uplinks to all components were in the range of Gbps. In the Spring of 2005, UFL purchased two large RAID fileservers, each with 9.6 TB of disk, in preparation for SC3.

The facility underwent a major renovation in the Fall of 2005. First, the networking infrastructure was upgraded to Gbps Ethernet throughout, with 2x10 Gbps uplinks to UFL’s 10 Gbps Campus Research Network. The network equipment connecting our servers is a fully populated Cisco 6509 with 288 GigE ports and a total of 4x10 GigE ports. Two of these 10 GigE ports provide our connectivity to the WAN and the Ultralight network through the 10 Gbps Campus Research Network. All of our publicly exposed servers are on the 10 Gbps network. Florida also procured two new racks of servers to upgrade the computational resource. A total of 86 new dual-core dual-Opteron 275 (2.2 GHz) servers, each with 500 GB of local storage and 4 GB of RAM, were installed and commissioned in November of 2005. The total computational capacity at UFL now amounts to 484 kSi2000. With the local storage of the new servers included, UFL’s total raw storage capacity is now 71 TB.

In February of 2006, UFL purchased a third rack of dual-core dual-Opteron 280 (2.4 GHz) servers, each configured with 1 TB of local storage and again 4 GB of RAM. The 44 new servers will add another 271 kSi2000 to the computational resource and another 44 TB of storage, for a grand total of 755 kSi2000 and 115 TB of disk. The described resource is available in its entirety for use by CMS and is configured specifically in accordance with CMS requirements. A portion of it, however, is shared with local departmental faculty, staff, and students conducting research on CMS and other HEP and astrophysics experiments. The total hardware procurement in 2005-2006 adds up to about $540k. This includes the storage hardware purchased in the Spring of 2005 and the two large procurements of computational hardware in the Fall of 2005 and early 2006. Of the total spent on hardware, $50k came from DISUN hardware funds.

2.1.4 University of Wisconsin at Madison

In year one of DISUN, the University of Wisconsin expanded its CMS Tier-2 cluster by adding 184 CPUs (46 dual-CPU/dual-core 1.8 GHz Opterons), 52 TB of disk space, and 10 CMS Tier-2 servers. We also upgraded to a dedicated 10 Gbps Internet connection via private 10 Gigabit Ethernet segments running directly to StarTap in Chicago. Our additional worker nodes added around 221 kSpecInt2000, resulting in a total dedicated CMS Tier-2 computing capacity of around 675 kSpecInt2000, and our 90 TB of raw disk space provides about 49.5 TB of dCache storage (RAID or resilient dCache based).

We decided to make a sizeable investment in growing the cluster at Wisconsin this year, clearly exceeding the US CMS scale goal by a significant margin, because of the special focus on Monte Carlo production for CMS at Wisconsin. Samples generated by UW on the DISUN, GLOW, and OSG grids are generally hosted at Wisconsin for some period of time, to provide access to CMS users before the samples are moved to permanent storage at FNAL and CERN. In this way, CMS benefits greatly from the increased CPU power at Wisconsin while it improves its data handling infrastructure to minimize the latency between the time the Monte Carlo is produced and the time it is available to the collaboration at FNAL and CERN. In total, upgrade expenditures at UW Madison since June 2005 add up to about $230,000, of which around $180,000 was funded by DISUN.

2.2 DISUN Performance during CSA06

In addition to the full-simulation Monte Carlo described above, DISUN has started large-scale generation of Alpgen Monte Carlo for W and Z plus jets. This is part of a class of physics simulations that follow what is known as the “Les Houches Accord”. A dozen or more generators exist that simulate different aspects of hadron collider physics at leading or next-to-leading order in perturbation theory. These generators generally simulate only parton-level kinematical quantities, and depend on Pythia or similar programs for the fragmentation of the parton showers. Generating even a small fraction of the samples expected during the first year of the LHC requires sizeable computing resources. On the plus side, the executables used to generate the physics are completely independent of the CMS software. DISUN has started using Alpgen simulations to stress test the US CMS computing infrastructure. We are generating samples now that can be reused later as the CMS detector simulation software improves, and we use this generation exercise to commission the computing infrastructure. In March 2006, a total of roughly 80,000 CPU hours were consumed on the OSG by the Alpgen Monte Carlo production effort. The Alpgen effort is centered at the DISUN UCSD site.
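To put the March 2006 figure in perspective, the following back-of-the-envelope calculation, a sketch using only the CPU-hour total quoted above, translates the monthly consumption into an equivalent number of continuously occupied batch slots.

    # Back-of-the-envelope: what 80,000 CPU hours in one month implies
    # in terms of continuously busy batch slots.
    cpu_hours = 80000          # Alpgen production on the OSG, March 2006
    wall_hours = 31 * 24       # 744 wall-clock hours in March

    avg_busy_slots = cpu_hours / wall_hours
    print("Equivalent to ~%.0f batch slots busy around the clock" % avg_busy_slots)
    # -> Equivalent to ~108 batch slots busy around the clock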

3 Distributed Computing Tools Activities

In addition to deployment and operations, the DISUN proposal called out seven specific areas of work during Phase 1 (the first 2 years) of the project. This section describes accomplishments in these areas during the first year of DISUN and indicates ongoing effort. Each area has generally received attention at more than one, but not all, of the collaborating institutions. The four institutions have diverse strengths and synergies with other projects in which they are engaged. As a result, we focus the effort at each institution on the issues it is best suited for, in order to benefit maximally from the existing expertise across all four institutions. We explicitly identify who contributed where in the various sections.

3.1 CMS Software Deployment Framework and Validation

3.1.1 Description of the Framework

Deploying software on distributed computing facilities requires at least three components: a transport mechanism for software deployment, a local software installation tool at the remote computing facility, and a tool to trigger the deployment.

We use the Open Science Grid (OSG) for the transport mechanism. For the local software installation, XCMSInstall, a custom CMS packaging based on the RPM packaging tool, was used until May 2006. Beginning in June 2006, CMS decided to use the Debian packaging tool APT, which is currently used for the actual software installation. As the tool to trigger the deployment, we developed a CMS Software Deployment portal based on a traditional CGI scripting technique, with X509-based authentication to allow software deployment to Grids. A diagram of the software deployment implementation is shown in the figure below.
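As an illustration of the trigger mechanism just described, a minimal CGI-style deployment trigger with X509-based authorization might look as follows. This is only a sketch: the authorized DN, the package name, and the install command are hypothetical placeholders, and the actual CMS Software Deployment portal differs in its details.

    #!/usr/bin/env python
    # Minimal sketch of a CGI deployment trigger with X509-based authorization.
    # The DN list and package name below are hypothetical placeholders.
    import os
    import subprocess

    AUTHORIZED_DNS = {
        "/DC=org/DC=example/OU=People/CN=Example Deployment Operator",
    }

    def main():
        print("Content-Type: text/plain\n")

        # With Apache mod_ssl and "SSLOptions +StdEnvVars", the client
        # certificate subject is exposed to CGI scripts in this variable.
        client_dn = os.environ.get("SSL_CLIENT_S_DN", "")
        if client_dn not in AUTHORIZED_DNS:
            print("ERROR: certificate subject not authorized for deployment")
            return

        # Hypothetical APT-based installation of a CMS software release,
        # standing in for the actual installation machinery.
        result = subprocess.run(["apt-get", "install", "-y", "cmssw-release"],
                                capture_output=True, text=True)
        print(result.stdout)
        print("OK" if result.returncode == 0 else "ERROR: installation failed")

    if __name__ == "__main__":
        main()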