Charge for the Reliability Working Group

Version 1 - DRAFT Date: 01/29/2018

The reliability goal for the components of the DUNE Far Detector inside the liquid argon cryostat is very ambitious: less than 0.x% of the readout channels should be lost over the lifetime of the experiment (20-30 years?), without any possibility of access or repair. Reliability must be incorporated in the design of every component, and a dedicated analysis of all possible failure mechanisms is required before finalizing the design of the ASICs, printed circuit boards, cables, connectors, and their supports that are housed inside the DUNE Far Detector cryostat.

The custom ASICs proposed for use in DUNE (BNL FE ASIC, joint LBNL-FNAL-BNL ADC, COLDATA ASIC, SLAC CRYO ASIC) incorporate design features aimed at minimizing the hot carrier effect, which is recognized as the main failure mechanism for integrated circuits operating at liquid argon temperature. This alone is not sufficient to ensure that the final design of the electronics inside the cryostat will meet the reliability requirements of DUNE. Particular care should be taken in the choice of components such as connectors, which are usually one of the major sources of failures in detectors.

There are few examples of HEP detectors that have been operated without intervention for a long period of time, with limited losses of readout channels, in conditions as extreme as those of the DUNE cryostats. Only space satellites (which usually have planned lifetimes in the range of 5-10 years) have requirements similar to those planned for DUNE. FERMI/GLAST is an example of a joint project between NASA and HEP groups that had a planned lifetime of five years and is on course to achieve its goal of ten years of operation in space.

The charge of the Reliability Working Group is to examine all possible failure mechanisms and produce a set of recommendations for the design of the Single Phase TPC Cold Electronics components that are housed inside the liquid argon. These recommendations will be used, if necessary, as part of the selection criteria between different ASIC designs. As part of its activities the working group should:

·  Review the segmentation of the cold electronics to understand which failures are going to have the largest impact on data taking.

·  Revisit recommendations for the ASIC design, beyond those aimed at minimizing the hot carrier effect.

·  Revisit the industry and NASA standards for the design and fabrication of printed circuit boards, connectors, and cables, and make recommendations for the quality assurance / quality control procedures to be adopted during the fabrication of the cold electronics components. The use of 3D X-ray and automated visual inspections, and other techniques, for quality control during production should be investigated.

·  Understand where it is desirable, necessary, and feasible to implement redundancy in the system to minimize the data losses caused by component failures.

·  Analyze the results from the protoDUNE production to understand the observed component failures. ASICs and boards that did not pass the qualification tests or that failed during commissioning should be used to investigate and develop new techniques for the QA/QC process and, where appropriate, to improve the design.
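As an illustration of the kind of arithmetic behind the segmentation and redundancy items above, the following Python sketch relates a channel-loss budget to a per-channel failure rate and shows how a series chain of components or a redundant pair changes the failure probability. It is a minimal sketch, not a DUNE model: it assumes independent failures and a constant failure rate over the whole lifetime, all function names are hypothetical, and the numbers in the usage lines are placeholders chosen only to exercise the functions (the actual loss budget and lifetime are still open in this charge).

```python
import math


def survival_probability(failure_rate_per_year, years):
    """Probability that a component still works after `years`.

    Assumes a constant failure rate (exponential lifetime model).
    """
    return math.exp(-failure_rate_per_year * years)


def max_failure_rate(loss_budget, years):
    """Largest constant per-channel failure rate (per year) for which the
    expected fraction of lost channels after `years` stays at `loss_budget`."""
    return -math.log(1.0 - loss_budget) / years


def chain_failure_probability(component_failure_probs):
    """Probability that a channel is lost when it depends on several
    components in series (e.g. ASIC, board, connector, cable), each failing
    independently with the given probability."""
    p_all_ok = 1.0
    for p in component_failure_probs:
        p_all_ok *= 1.0 - p
    return 1.0 - p_all_ok


def redundant_failure_probability(p_fail, copies=2):
    """Probability that all `copies` independent redundant units fail."""
    return p_fail ** copies


# Placeholder inputs, NOT DUNE requirements.
example_budget = 0.01    # hypothetical allowed fraction of lost channels
example_years = 20       # hypothetical operating time in years
rate = max_failure_rate(example_budget, example_years)
print(f"max per-channel failure rate: {rate:.2e} per year")
print(f"series chain loss: {chain_failure_probability([0.1, 0.2]):.2f}")
print(f"redundant pair loss: {redundant_failure_probability(0.1):.3f}")
```

Under these assumptions a component shared by many channels contributes its failure probability to every channel it serves, which is why the segmentation review and the redundancy study are closely linked.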

We expect this working group to evolve into a working group tasked with developing the quality control program to be used during production. In the short term, however, we want it to focus on understanding which areas of the cold electronics design are most critical for meeting the reliability requirements of DUNE. We expect the Reliability Working Group to report regularly on its activities to the Cold Electronics consortium, and we expect these reports to be included in the presentations to the DUNE Technical Board and to the LBNC.
