14 May 2002 Meeting Minutes
SRE Space Systems Reliability Tools Standards Working Group
The 4th meeting of the Space Systems Reliability Tools (SSRT) Standards Working Group was held on Tuesday, May 14, 2002, from 8:30 AM to 11:30 AM PDT. The meeting consisted of two separate teleconferences: one hosted at The Aerospace Corporation in El Segundo, CA, and the other hosted at DSI International in Orange, CA. The meeting agenda is on page 3.
The objective of the SSRT Standards WG is to develop a commercial standard that provides a single framework for linking different reliability assessment tools. This framework shall be built by defining critical process addresses[1] and standard formats for all data elements used in appropriate identification, analysis, and verification of Reliability, Maintainability, and Availability (RMA) requirements for space systems. In the context of this standard, “appropriate identification, analysis, and verification…” means there would be negligible risk of adverse effects from using the results. The title of the standard shall be, “Standard Format for Space System Reliability Computer Applications,” and its scheduled completion date is 30 September 2002.
The WG is organized into two teams. Team 1 is tasked with defining the data elements and their critical process addresses. The members of Team 1 are all Reliability Engineering experts and their lead is Tyrone Jackson. Team 2 is tasked with defining the standard formats for the data elements. The members of Team 2 are all reliability tool developers and their lead is Dan Hartop.
Participants at the May 14th meeting were:
NAME COMPANY PHONE E-MAIL
Steve Harbater TRW 858-592-3490
Dan Hartop(2) DSI Intl. 714-637-9325
Jim Sketoe Boeing 253-773-2891
Al Jackson CSULB Eng Grad College 310-493-7469
Tyrone Jackson(1) Aerospace Corp. 310-336-6170
Xuegao (David) SoHar Inc. 323-653-4717
Walt Willing Northrop Grumman 410-765-7372
(1) Meeting coordinator and Team 1 lead
(2) Team 2 lead
The following individuals are on regular distribution for the SSRT Standards WG Meeting minutes:
Mike Canga NASA JSC 281-483-5395
J C Cantrell Aerospace Corp. 310-336-2899
Terry Kinney Spectrum Astro 719-550-0325
Robert Poltz Design Analytx 877-327-7550
Kamran Nouri Item Software 714-935-2900
James Womack Aerospace Corp. 310-336-7647
John Ingram-Cotton Aerospace Corp. 310-336-1249
Dave Dylis RAC 315-339-7055
Eric Gould DSI Intl. 714-637-9325
Jim Kallis Raytheon 310-647-3620
Bill Geimer Northrop Grumman 626-812-2783
Leo F. Watkins Lockheed Martin 817-935-4452
Marios Savva ReliaSoft 520-886-0410
Adamantios Mettas ReliaSoft 520-886-0366 Ext. 29
Doug Ogden ReliaSoft 520-886-0366 Ext. 41
Rich Pugh Pratt & Whitney
Ken Murphy ARINC 505-248-0640
Chuck Anderson GRC International 281-483-4087
Myron Hecht Sohar Inc. 323-653-4717X111
Rebecca Menes Sohar Inc. 323-653-4717X101
Bob Miller TRW 310-812-2840
Kevin P. Van Fleet Relex Software 724-836-8800 x105
Hunter Shaw Relex Software 724-836-8800
Clarence Meese SRE
May 14th Meeting Agenda
Time SRE Working Group Administrative Topics
8:30 - 8:40 PDT Take roll
Vote to approve the minutes of the April 30th meeting
Remind participants to pay their SRE membership dues
Time Team 1 Discussion Topics
8:40 - 9:00 PDT Discuss Status of Action Items from the April 30th Meeting – Tyrone Jackson
9:00 - 9:20 PDT Discuss the Electrical Stress Derating Analysis Flow Diagrams – Steve Harbater
9:20 - 9:50 PDT Discuss the Reliability Prediction Process Flow Diagram for the Preliminary Design Phase – Jim Sketoe
9:50 - 10:00 PDT Break
10:00 - 10:45 PDT Discuss the First-Cut Standard Formats for Reliability Data – Tyrone Jackson
Team 2 Discussion Topics
8:30 - 9:50 PDT Discuss Status of Action Items from the April 30th Meeting – Dan Hartop
9:50 - 10:00 PDT Break
10:00 - 10:45 PDT Begin Developing a Draft Outline for the Standard, which is titled, “Standard Format for Space System Reliability Computer Applications” – Dan Hartop
Team 1 & 2 Summary
10:45 - 11:30 PDT Summary and Review of Action Items – All
11:30 PDT Meeting Adjourn
Team 1 Discussion Topics
· Team 1 participants in the May 14th meeting were:
q Tyrone Jackson (Team Lead)
q Steve Harbater
q Walt Willing
q Jim Sketoe
· The group did not meet the minimum number of participants required for a Team 1 quorum and decided to postpone the vote on approval of the April 30th meeting minutes until the next scheduled meeting on May 28th.
· The group agreed that Visio 2000 diagrams should be converted to Visio 5 format before distribution to the working group for review.
· The group reviewed Steve’s Electrical Stress Derating Process Flow Diagram and accompanying write-up. Steve mentioned that the secondary parameters are sometimes excluded from the stress derating analysis to save money. Tyrone volunteered to develop draft definitions for some of the electrical stress derating parameters. He plans to use the Fortran source code from an old MIL-HDBK-217 program to build a list of component-specific derated parameters.
· The group reviewed Jim’s Reliability Prediction Process Diagram for the Preliminary Design Phase. The group agreed that unit-level and component-level trade studies are often performed during the Preliminary Design Phase. Therefore, the use of reliability data to support trade studies should be added to the Reliability Prediction Process Diagram. Jim will modify the diagram.
· The group discussed the widespread trend away from piece part FMECA. Walt said that, at a minimum, FMECA should be performed to identify the effects of failures at the interfaces of a Line Replaceable Unit (LRU). He added that identifying internal failure modes of an existing LRU would not be an efficient use of an analyst’s time, whereas identifying internal failure modes of a new or modified LRU would be. The group agreed with Walt.
· The group agreed that FMECA should be used to validate the Reliability Block Diagram (RBD), and both the FMECA and RBD should begin at the same level of indenture.
· The group agreed on the following concepts:
o In an ideal world, where tools are available to apply all reliability methods with equal effort to all items, the preferred order of reliability methods would be:
1. Field data
2. Test data
3. Physics of failure (PoF) equations if they were derived from applicable test data
4. Handbook reliability prediction equations if they were derived from applicable field data
o The MTBF calculation for COTS should be based on either field data or test data.
o In the real world (at least for now), handbook reliability prediction methods are the most cost effective choice for MTBF calculations because:
§ Sufficient field and test data are not available for all items in modern space systems.
§ A key goal of the Responsible Design Engineer (RDE) should be to eliminate all wearout mechanisms that can affect mission success. Therefore, PoF would not be necessary if this goal is met.
§ Cost effective PoF tools are not available.
o Some of the problems associated with handbook reliability prediction methods include:
§ Use of proprietary parameters
§ Failure rate equations that were not derived from field data
§ Unknown confidence bounds for calculated failure rates
§ Assumed exponential (constant) failure rates for all items
§ Lack of a comprehensive set of hazard rate equations for non-electronic parts
§ Lack of a comprehensive set of non-operating failure rate equations for electronic parts
· Tyrone discussed an example of a standard reliability data format that he derived from the old B1 and B2 sheets in MIL-STD-1388-A. The example consists of predefined keywords that have origination points identified on critical process flow diagrams. The points on the diagrams serve as data addresses. To allow consistent identification of the data by different reliability assessment tools, the keywords are arranged in an indentured configuration that is based on data dependency. Take, for example, a spacecraft Mean Mission Duration (MMD) prediction. Its standard electronic data interchange format might look something like this:
RELIABILITY
    PREDICTION
        MMD
            RWEIBULL (Rayleigh-Truncated Weibull)
                SCALE = 60.0
                SHAPE = 1.75
                BWEAROUT (Begin Wearout) = 36
                MWEAROUT (Mean Wearout) = 48
                CONFIDENCE = 0.5
                UNITS = MONTHS
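For illustration only, the same indentured keyword structure could also be carried as XML, consistent with the XML keyword option noted in footnote [1]. The element and attribute names below are hypothetical and have not been agreed by the working group; they simply show how the data-dependency indenture might map to element nesting:

    <!-- Hypothetical XML rendering of the MMD prediction example above;
         element and attribute names are illustrative only. -->
    <Reliability>
      <Prediction>
        <MMD units="MONTHS" confidence="0.5">
          <Distribution type="RWEIBULL"> <!-- Rayleigh-Truncated Weibull -->
            <Scale>60.0</Scale>
            <Shape>1.75</Shape>
            <BeginWearout>36</BeginWearout>
            <MeanWearout>48</MeanWearout>
          </Distribution>
        </MMD>
      </Prediction>
    </Reliability>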
Team 2 Discussion Topics
· Team 2 participants in the May 14th meeting were:
q Dan Hartop (Team Lead)
q David Xuegao
q Al Jackson
· The group met the minimum number of participants required for a Team 2 quorum.
· The following tasks have been completed:
o Created, Updated and Reviewed Initial Schema
o Documented Updated Schema Considerations for review by Team 2
o Discussed potential Interoperability paths and approach
· As a side note, DSI will ultimately create an XSL style sheet (an XML document that specifies how to transform one XML document into another, more useful form) for converting a Fault Tree XML (FTML?) document into an Excel XML spreadsheet (supported by Excel 2002). This will be accomplished sometime over the next few months, as DSI's schedule permits. Therefore, the corresponding action item will carry no definite due date other than completion by September 2002. A rough sketch of such a style sheet is shown below.
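As a rough sketch only (the input element names such as FaultTree and Gate are assumed here, since the fault tree schema is still under review by both teams), such a style sheet would transform gate records into spreadsheet rows along these lines:

    <!-- Illustrative XSLT sketch: writes one spreadsheet row per fault tree gate.
         The input element and attribute names (FaultTree, Gate, name, type) are
         assumptions, not part of the draft schema. -->
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
      <xsl:template match="/FaultTree">
        <ss:Workbook>
          <ss:Worksheet ss:Name="Gates">
            <ss:Table>
              <xsl:for-each select="//Gate">
                <ss:Row>
                  <ss:Cell><ss:Data ss:Type="String"><xsl:value-of select="@name"/></ss:Data></ss:Cell>
                  <ss:Cell><ss:Data ss:Type="String"><xsl:value-of select="@type"/></ss:Data></ss:Cell>
                </ss:Row>
              </xsl:for-each>
            </ss:Table>
          </ss:Worksheet>
        </ss:Workbook>
      </xsl:template>
    </xsl:stylesheet>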
· Team 2 Future Agenda
o Team 2 - Complete review of Gate Types, ensure consistent parsing for existing tools
o Team 2 - Define interoperability paths for Fault Tree and other Schemas
o Team 1 - Provide input to Team 2 regarding current schema
Action Items
1. Team 1 Action Items –
a. All – Review the updated Fault Tree Schema that Team 2 constructed. Specifically, check for correctness, completeness, and compliance with the stated objective of the standard (see page 1).
b. Jim – Update the diagram for the Reliability Prediction Process during the Preliminary Design Phase. Specifically add references to Reliability Trade Studies and FMECA.
c. Tyrone and Steve – Tackle the Team 2 action item to begin developing a draft outline for the standard, which is titled, “Standard Format for Space System Reliability Computer Applications”.
d. Tyrone – Construct a flow diagram for Similarity Analysis that shows how individual reliability assessment tasks might be integrated at the Reliability Program level.
e. Tyrone – Develop draft definitions for some of the more typical electrical stress derating parameters.
f. Tyrone – Write a draft guide and construct Reliability Analysis Process Flow Diagrams for the Detailed Design Phase.
2. Team 2 Action Items –
a. All - Review the updated Fault Tree Schema. Specifically, check for correctness, completeness, and compliance with the stated purpose of the standard (see page 1).
b. SOHAR - Define interoperability (inputs and outputs to existing tools).
c. SOHAR - Complete review for completeness of Gate Types.
d. John - Review and update schema documentation.
e. All - Review Team 1 documentation & findings.
Next Meeting
The next SSRT Standards WG Meeting is scheduled for May 28, 2002, at 8:30 AM PDT. Team 1 and Team 2 will hold separate teleconferences from 8:30 AM to 10:45 AM PDT. At 10:45 AM PDT, Team 1 will join the Team 2 teleconference to discuss progress and actions. The following teleconference numbers are to be used:
q Team 1 teleconference number - (888) 550-5969, pass code 646354
q Team 2 teleconference number - (888) 550-5969, pass code 162080
Arrangements have been made for Team 1 to use NetMeeting™ concurrently during the teleconference. For those who prefer face-to-face discussions, meeting rooms have been reserved at the following locations:
q Team 1 meeting room - The Aerospace Corporation, Building D-8, 200 N. Aviation Boulevard, El Segundo, CA 90245-4691
q Team 2 meeting room - DSI International, 1574 N. Batavia, Suite 3, Orange, CA 92867
Planned Future Meetings
Location: The Aerospace Corporation, Building D-8, 200 N. Aviation Boulevard, El Segundo, CA 90245-4691
Date: 2002
5/28 Teleconference
6/11 Teleconference
6/25 Teleconference
7/16 Teleconference
7/30 Teleconference
8/13 Teleconference
8/27 Teleconference
9/10 Teleconference
9/24 Teleconference
Please direct all comments regarding these meeting minutes to:
Tyrone Jackson
SSRT Standards Working Group Coordinator
Reliability & Statistics Office
The Aerospace Corporation
Ph. (310) 336-6170
Fax (310) 336-5365
Email:
Top-10 problems that affect the Reliability Programs of Space Systems as determined by an internal working group survey:
1. Valuable reliability lessons learned often are not in a format that is readily useable by the Reliability Program, or they have become “lessons forgotten” or “lessons ignored”.
2. Some reliability critical items often are not identified at all or are not properly controlled.
3. System reliability predictions often do not include probability of occurrence estimates for all relevant failure modes, failure mechanisms, and failure causes. (Probability of an induced fault during manufacture, or probability of damage during assembly often is not included in reliability predictions.)
4. The perceived accuracy of high-precision system reliability predictions often is not supported by the input data, which is of lower precision than the result.
5. The steadily shrinking pool of “experienced” Reliability Engineering specialists is unable to meet the needs of a steadily growing number of space system development projects.
6. Many commercial reliability assessment tools have major shortcomings that may not be obvious to the casual reliability analyst (e.g., inaccurate equipment failure rate models, use of unverifiable parameters in equations, high misapplication rates, etc.).
7. Often, insufficient funding is provided to perform all of the tasks necessary for a High-Reliability Program. (Some customers and managers believe that high-reliability can be tested-in more cost-effectively than it can be designed-in.)
8. Different approaches are being used across the space industry to perform reliability assessment tasks that are called by the same name but often serve different purposes. (Inconsistency in reliability assessment practices has become a major problem since DoD canceled military standards in the late 1990s.)
9. Some customers believe that all dependability predictions for space vehicle constellations are too conservative. (The basis of this belief is rooted in historical evidence showing that contingency procedures of ground operations are very effective at extending the useful life of a space vehicle far beyond its predicted mean life. This phenomenon has resulted in many customers buying more space vehicles than necessary to meet the dependability requirements of the constellation.)
10. Sometimes the reliability analyst cannot take advantage of (or is unaware of) some of the critical data paths that link a particular task of the Reliability Program with:
a. Other tasks within the Reliability Program;
b. Systems Engineering Process functions outside the Reliability Program; or
c. External product-related data sources.
[1] The critical process addresses may be defined using machine-readable alphanumeric symbols or human-readable Extensible Markup Language (XML) keywords.