Semiconductor Equipment and Materials International
3081 Zanker Road
San Jose, CA 95134-2127
Phone: 408.943.6900 Fax: 408.943.7943
Background Statement for SEMI Draft Document 4743A
New Standard: SPECIFICATION FOR COMMUNICATION OF DATA QUALITY
Note: This background statement is not part of the balloted item. It is provided solely to assist the recipient in reaching an informed decision based on the rationale of the activity that preceded the creation of this document.
Note: Recipients of this document are invited to submit, with their comments, notification of any relevant patented technology or copyrighted items of which they are aware and to provide supporting documentation. In this context, “patented technology” is defined as technology for which a patent has issued or has been applied for. In the latter case, only publicly available information on the contents of the patent application is to be provided.
Background
With the semiconductor industry moving to e-Manufacturing, the data the equipment generates is critical to improving equipment productivity. The quality of the reported data is of paramount importance to ensure successful use of software applications that use and/or analyze the data. It impacts SECS-II, EDA, and other standard interfaces to factory systems and is closely tied to the verification of the interface protocols.
The current Guide for Understanding Data Quality (SEMI E151) describes data quality terminology and a process for understanding data quality. The ultimate goal of the task force is to develop a Data Quality standard that contains the following components: Metrics, Communication Templates, an Application Note, and Reporting Compliance Methods as indicated in the roadmap below. The Communication Templates, an Application Note, and Reporting Compliance Methods are dependent upon the data quality metrics; therefore, it is appropriate to first standardize on these metrics.
This specification will help determine, specify, and verify data quality metrics for anyone producing or using data from semiconductor equipment via a SEMI standard communication interface.
The results of this ballot will be discussed at the next North America I&C committee meeting on November 10, 2010 in conjunction with the NA Standards Fall 2010 meetings in San Jose, CA.
SEMI Draft Document 4743A
New Standard: SPECIFICATION FOR COMMUNICATION OF DATA QUALITY
1 Purpose
1.1 With the semiconductor industry deploying e-Manufacturing, the data equipment generates is critical to improving equipment productivity. The quality of that reported data is of paramount importance to ensure successful use of software applications that use and/or analyze the data. It impacts both SECS (SEMI Equipment Communications Standard) and EDA (Equipment Data Acquisition) interfaced systems. The purpose of this standard is to provide a generally acceptable specification for communicating the quality of semiconductor equipment data. This standard is intended to be used in the following ways:
- To establish a set of metrics for communication of quality of data produced by semiconductor equipment.
- To communicate the quality of selected data parameters produced by semiconductor equipment.
- To provide a measurable level of data quality to ensure quality of service and/or quality of performance of applications used by equipment suppliers, semiconductor manufacturers, equipment subsystem suppliers, and control system suppliers.
- To provide data quality reporting compliance methods.
2 Scope
2.1 This specification defines metrics and calculations for reporting equipment data quality. The data within scope of this standard is any data that is retrievable from semiconductor equipment via electronic interfaces. Data quality metrics allow for the quantification of a data quality attribute or element so that this attribute or element can be communicated and verified. This document defines data quality metrics corresponding to the Aspects, Attributes, and Elements of data quality as defined in the Guide for Understanding Data Quality (SEMI E151-0309).
NOTICE: This standard does not purport to address safety issues, if any, associated with its use. It is the user’s responsibility to establish appropriate safety and health practices and determine the applicability of regulatory or other limitations prior to use.
3 Limitations
3.1 This standard does not specify how the quality of reported data shall or could be improved. It provides a method to communicate the existing quality of reported data.
3.2 The term “data” as used in this version of the standard, is currently limited to data that is or can be collected from production equipment in the semiconductor industry via a standard electronic interface. In the future the standard may be expanded to address data that can be collected from other entities within the semiconductor factory.
3.3 This standard does not address the protocol or structures of messaging used to transport or communicate data values. That messaging infrastructure is dealt with in other SEMI standards.
3.4 The data quality metrics contained in this standard are intended to apply specifically to the variable value content of those messages, which constitutes the data.
3.5 This specification does not cover the methodology for communication of data.
3.6 This specification does not provide test methods for verification of data quality.
3.7 This version of the specification does not include the communication templates or the data quality reporting compliance methods. These will be developed for later versions of the standard. This version of the specification provides metrics for measuring data quality attributes and elements as defined in SEMI E151.
3.8 This standard does not specify which parameters should be subjected to this specification of data quality.
4 Referenced Standards and Documents
4.1 SEMI Standards
SEMI E5 — SEMI Equipment Communications Standard (SECS-II)
SEMI E30 — Generic Model for Communications and Control of SEMI Equipment (GEM)
SEMI E37 — High-Speed SECS Message Services (HSMS) Generic Services
SEMI E54 — Sensor/Actuator Network Standard
SEMI E120 — Specification for the Common Equipment Model (CEM)
SEMI E125 — Specification for Equipment Self Description (EqSD)
SEMI E128 — Specification for XML Message Structures
SEMI E132 — Specification for Equipment Client Authentication and Authorization
SEMI E134 — Specification for Data Collection Management (DCM)
SEMI E138 — XML Semiconductor Common Components
SEMI E145 — Classification for Measurement Unit Symbols in XML
SEMI E151 — Guide for Understanding Data Quality
NOTICE: Unless otherwise indicated, all documents cited shall be the latest published versions.
5 Terminology
5.1 Abbreviations and Acronyms
5.1.1 EDA — Equipment Data Acquisition interface. This refers to the suite of standards including SEMI E120, E125, E128, E132, E134, E138, and E145.
5.1.2 SECS-II — SEMI Equipment Communications Standard
5.2 Definitions
5.2.1 Full Scale — the maximum (max) value that can be represented.
5.2.2 Reporting Rate — the number of messages per unit of time reported through an interface.
6 Conventions
6.1 Requirements Identification — the following notation specifies the structure of requirement identifiers.
6.1.1 The following requirement prefix format (see Table 1) is used at the beginning of the requirement text.
- [Esss.ss-RQ-nnnnn-nn]
- The following suffix format is used to mark the end of the requirement text.
- [/RQ]
- Requirement IDs apply across the row (left to right) of a table.
Table 1 Requirement Identifiers
Format Notation / Purpose
E / SEMI standards volume designator. Example: E for Equipment, M for Metrics
sss.ss / SEMI standards specification identifier. Examples: E087.00, E087.01, E134.00
RQ / Indicates this is a requirement identifier.
nnnnn / Unique five-digit number within this specification. The range 90000-99999 is reserved for use by SEMI.
nn / Unique two digit version number for this requirement. Value 00 is the first version.
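The identifier notation in Table 1 can be checked mechanically. The following is a minimal sketch, not part of the specification: the names `REQ_PREFIX`, `RequirementId`, and `parse_requirement_id` are hypothetical, and the pattern assumes three characters, a dot, and two digits after the volume letter (as in the examples E087.00 and E134.00). It also accepts the draft placeholder "xxx" used in this document's own identifiers.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern for the prefix notation in Table 1.
# Assumption: "x" is allowed in place of a digit so that draft
# placeholders such as Exxx.00 are accepted.
REQ_PREFIX = re.compile(
    r"\[(?P<volume>[A-Z])(?P<spec>[0-9x]{3}\.[0-9]{2})"
    r"-RQ-(?P<number>[0-9]{5})-(?P<version>[0-9]{2})\]"
)
REQ_SUFFIX = "[/RQ]"  # marks the end of the requirement text


class RequirementId(NamedTuple):
    volume: str              # e.g. "E" for Equipment
    spec: str                # e.g. "087.00"
    number: int              # unique five-digit number
    version: int             # 0 is the first version
    reserved_for_semi: bool  # True for 90000-99999


def parse_requirement_id(text: str) -> Optional[RequirementId]:
    """Return the first requirement identifier found in text, or None."""
    match = REQ_PREFIX.search(text)
    if match is None:
        return None
    number = int(match.group("number"))
    return RequirementId(
        volume=match.group("volume"),
        spec=match.group("spec"),
        number=number,
        version=int(match.group("version")),
        reserved_for_semi=90000 <= number <= 99999,
    )
```

For example, `parse_requirement_id("[E087.01-RQ-90001-02] text [/RQ]")` yields a record whose `reserved_for_semi` field is true, because the number falls in the range reserved for SEMI.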
7 Overview
7.1 The Guide for Understanding Data Quality (SEMI E151) defines three aspects of Data Quality: Availability, Interpretability, and Usability. These aspects have attributes and elements as defined in SEMI E151. These attributes and elements are of interest in the reporting of data quality.
8 Data Quality Metrics
8.1 Tables 2-4 list the data quality metrics, delineated by Aspect, Attribute, and Element, as defined in SEMI E151. This includes listing the data quality aspects associated with the parameter and the metrics used to evaluate the related data quality attribute or element of interest. These metrics can be applied to a single data parameter or a group of data parameters from a single interface.
[Exxx.00-RQ-00001-00] When a user wishing to comply with this standard is defining a particular data parameter, they shall use Tables 2-4 and specify the appropriate data quality attribute or element from the defined tables by selecting the appropriate subset of rows that relate to the specification or definition of that data parameter [/RQ].
[Exxx.00-RQ-00002-00] The supplier shall enter the values for the selected attributes or elements using the defined formats or metrics [/RQ].
8.2 Note that not all metrics are required for the elements shown in Tables 2-4 when assessing the data quality of a particular data parameter. The Requirement Identifier’s Requirement Type indicates whether or not metrics for these elements or attributes are required to be specified to describe the data quality aspects of the selected data parameter under consideration; the notation used in the column is defined in Section 6.
Table 2 Data Quality Metrics - Availability
Requirement ID / Attribute / Element / Metric / Reqd / Metric Unit / Calculation / Verification Method / Value Expression / Comments including specifications on conditional
[Exxx.00-RQ-00003-00] / Interface / N.A. / Interface / Y / Enumerated list consisting of at least:
<“SECS-II”, “EDA”, or “Sensor Actuator Network”> / Selection of exactly one element from the list / Text string equal to exactly one element from list / SECS-II (E5, E30, E37)
Sensor Actuator Network (E54)
EDA (E125, E134)
The row indicates the interface used for communicating data.
When using multiple interfaces use multiple instances of the communication template; one for each interface.
[Exxx.00-RQ-00004-00] / Mechanism / N.A. / Mechanism / Y / Any text / Existence of Text / Text / A description of the mechanism by which the data is communicated over the interface (e.g., response to S1F3 Selected Equipment Status Request in SECS-II or response to GetParameterValueRequest() in EDA).
#1 “Reqd” means “Required”
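The two Availability rows reduce to simple checks: the Interface value must be exactly one element of the enumerated list, and the Mechanism text must exist. The following is a minimal sketch under those readings. `AvailabilityEntry` and `check_availability` are hypothetical names, not defined by this specification, and the list holds only the three interfaces that Table 2 names as the minimum.

```python
from dataclasses import dataclass
from typing import List

# Minimum enumerated list named in Table 2; an implementation may extend it.
INTERFACES = ("SECS-II", "EDA", "Sensor Actuator Network")


@dataclass
class AvailabilityEntry:
    """Hypothetical container for the two Availability metrics."""
    interface: str  # exactly one element of INTERFACES
    mechanism: str  # free text describing how the data is communicated


def check_availability(entry: AvailabilityEntry) -> List[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems: List[str] = []
    if entry.interface not in INTERFACES:
        problems.append("Interface must be exactly one element of the list")
    if not entry.mechanism.strip():
        problems.append("Mechanism must be non-empty text")
    return problems
```

Because each entry carries a single interface, equipment that reports the same parameter over two interfaces would be described by two entries, in line with the comment in the Interface row.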
Table 3 Data Quality Metrics – Interpretability
Requirement ID / Attribute / Element / Metric / Reqd / Metric Unit / Calculation / Verification Method / Value Expression / Comments including specifications on conditional
[Exxx.00-RQ-00005-00] / Protocol / N.A. / Protocol / C / Enumerated list of protocols, including those referenced in the “E54-Sensor Actuator Network Standard”, e.g., “LonWorks”, “Profibus”, “Modbus/TCP”, “CC-Link”, “Ethernet-IP”, “Profinet”, “SafetyBus p”, “A-Link”, “EtherCAT”, or other defined protocols / Selection of one or more protocols from the enumerated list / List of one or more text strings where each element is one protocol from the enumerated list / The associated language or sensor actuator network communication protocol. Conditionality is that this metric shall only be used if the value of the Interface metric is “Sensor Actuator Network”.
A zero length list means that none of the listed protocols are supported. If the Interface is SECS-II or EDA then this field is not applicable.
[Exxx.00-RQ-00006-00] / Format / N.A. / Format / Y / Finite list of formats provided over the Interface / Selection of one or more formats from the list, where each of these formats is provided in the dictionary associated with the Interface / List of one or more text strings where each element is one format from the list. The list consists of formats from E5 (SECS), E138 (EDA), or E54 (Sensor Actuator Network). / Format relates back to the parameter data type definition for the interface selected. All the formats in the list shall be associated with the same data quality. If a particular format results in different data quality, then that data element shall be listed separately.
[Exxx.00-RQ-00007-00] / Units / N.A. / Units / Y* / Finite list of units provided over the Interface / Selection of one or more units from the list where each of these units is identified in the standard associated with Interface.
*If a parameter is not associated with a unit then the parameter shall be unitless, e.g., a ratio / Finite list consisting of units from E5 (section 12), E145 (section 9), or E54.1 (Appendix 1). / Units normally apply only to a limited number of numeric parameters. All of the units in the list shall be associated with the same data quality.
If a particular unit results in different data quality, then that data element shall be listed as separate rows in the communication template instance, one for each set of units that are associated with a different data quality.
[Exxx.00-RQ-00008-00] / Order / N.A. / Order / Y / Not Ordered or Ordered / A Boolean value of 0 or 1 / Boolean
0 – Not Ordered
1 – Ordered / Order by which data is reported. The attribute applies only to arrays or structured data.
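The conditional Protocol row and the Boolean Order row in Table 3 can be expressed as a small consistency check. This is a minimal sketch, not a normative test method: `check_interpretability` is a hypothetical name, `None` is used here to mean "not applicable", and treating a missing protocol list on a Sensor Actuator Network as a problem is an assumption drawn from the "C" (conditional) marking.

```python
from typing import List, Optional, Sequence

SAN = "Sensor Actuator Network"


def check_interpretability(
    interface: str,
    protocols: Optional[Sequence[str]],
    formats: Sequence[str],
    order: int,
) -> List[str]:
    """Return a list of problems; an empty list means all checks passed."""
    problems: List[str] = []
    if interface == SAN:
        # A zero-length list is allowed: it means none of the listed
        # protocols are supported. Only a missing list is flagged.
        if protocols is None:
            problems.append("Protocol list expected for Sensor Actuator Network")
    elif protocols is not None:
        problems.append("Protocol applies only to Sensor Actuator Network")
    if len(formats) < 1:
        problems.append("Format needs one or more entries")
    if order not in (0, 1):
        problems.append("Order must be 0 (Not Ordered) or 1 (Ordered)")
    return problems
```

The format strings passed in are illustrative only; the actual values would come from the dictionary of the selected interface, as the Format row states.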
Table 4 Data Quality Metrics – Usability
Requirement ID / Attribute / Element / Metric / Reqd / Metric Unit / Calculation / Verification Method / Value Expression / Comments including specifications on conditional
[Exxx.00-RQ-00009-00] / Accuracy / N.A. / Accuracy / Y / % of Range, % of Full Scale, or % of Value / {Absolute value of [Average of (reported value – actual value)]} assessed at multiple equidistant points across the entire range. This value is then divided by the Range, Full Scale, or actual Value of the data, depending on how the value is to be expressed. / A tuple. The first value is the calculated accuracy, expressed as a percentage of one of the metric choices (Range, Full Scale, or Value). The second value states which of the metric representations is being used: 1 = Range, 2 = Full Scale, or 3 = Value. The third value is the sample size used to calculate the average (a positive integer). / Specified for numeric parameter values only, e.g., 2% of Range.
A lower value indicates more accurate data.
[Exxx.00-RQ-00010-00] / Accuracy / Range / Maximum value / C / Value / Maximum value is used in the calculation of other data quality metrics / Units / Specified for numeric parameter values only. Expressed in units.
Specified only if parameter has a maximum reportable and measurable value.
[Exxx.00-RQ-00011-00] / Accuracy / Range / Minimum value / C / Value / Minimum value is used in the calculation of other data quality metrics / Units / Specified for numeric parameter values only. Expressed in units.
Specified only if parameter has a minimum reportable and measurable value.
[Exxx.00-RQ-00012-00] / Accuracy / Bias / Maximum bias / C / +/- Value / Maximum value of bias assessed at multiple points across the entire range / A tuple. The first value is the maximum bias, expressed in Units. The second value is an indication of how many points were used to assess the bias (a positive number). / Specified for numeric parameter values only. Bias expressed in units. The information associated with the second value in the tuple may or may not be provided. If it is not provided the tuple value is set to zero.
Specified only if parameter source has a concept of Bias.
[Exxx.00-RQ-00013-00] / Accuracy / Bias / Average bias / C / +/- Value / Average value of bias assessed at multiple points across the entire range / A tuple. The first value is the calculated average bias, expressed in Units. The second value is an indication of how many points were used to calculate the average (a positive number). / Specified for numeric parameter values only. Bias expressed in units. The information associated with the second value in the tuple may or may not be provided. If it is not provided the tuple value is set to zero.
Specified only if an average bias has been calculated. This should be measured at equidistant points across the entire range.
[Exxx.00-RQ-00014-00] / Accuracy / Drift / Maximum drift / C / Value per time period / Maximum value of drift assessed for a parameter over a continuous time period across the entire range / A tuple. The first value is the calculated maximum drift expressed in Units. The second value is the time period over which the maximum drift is being expressed. / Specified for numeric parameter values only. Drift is expressed in units/second.
[Exxx.00-RQ-00015-00] / Accuracy / Drift / Average drift / N / Value per time period / Average value of drift assessed for a parameter over a continuous time period across the entire range / A tuple. The first value is the calculated average drift expressed in Units. The second value is the time period over which the average drift is being expressed. / Specified for numeric parameter values only. Drift is expressed in units/second.
[Exxx.00-RQ-00016-00] / Precision / Repeatability / Standard deviation / N / Value per sample size / Standard deviation of a number of consecutive measurement values which is called “sample size” / A tuple. The first value is the Standard Deviation expressed in Units. The second value is the sample size expressed as a positive integer. / Specified for numeric parameter values only. Specified only if parameter has a normal distribution or vendor wishes to report repeatability using this metric.
[Exxx.00-RQ-00017-00] / Precision / Resolution / Resolution / C / Number of significant digits / N.A. / A positive integer / Specified for numeric parameter values only. Resolution to which data is both accurate and reportable. Note that resolution may be limited by the data type.
[Exxx.00-RQ-00018-00] / Frequency / Sampling Rate / Minimum Sampling Rate / Y / Value/second / Minimum number of samples that are guaranteed to be collected divided by a time period / A tuple. The first value is the minimum sampling rate expressed in Hertz. The second value is the measurement time period. The third value is the standard deviation of the change in the interval between samples over the measurement time period. The fourth value is a text field indicating the conditions under which the minimum sampling rate requirement will always be met. / The “conditions” could refer to issues such as the state of the tool (e.g., idle, processing) and the number of parameters being collected and their collection rate. Note that this is a calculation of average minimum sampling rate over a time period.
[Exxx.00-RQ-00019-00] / Frequency / Reporting Rate / Minimum Reporting Rate / Y / Value/second / Minimum number of reports of a parameter that can be guaranteed divided by a time period / A tuple. The first value is the minimum reporting rate expressed in Hertz. The second value is the measurement time period. The third value is the standard deviation of the change in the interval between reports over the measurement time period. The fourth value is a text field indicating the conditions under which the minimum reporting rate requirement will always be met. / Specified for numeric parameter values only. Minimum reporting rate that can be guaranteed across the interface.
Note that this is a calculation of average minimum reporting rate over a time period.
The “conditions” could refer to issues such as the state of the tool (e.g., idle, processing) and the number of parameters being collected and their collection rate (i.e., bandwidth of data being exported from the equipment).
[Exxx.00-RQ-00020-00] / Latency / Tool Latency / Average Latency / Y / Value / Average latency of reports of a parameter. If synchronous this is with respect to being reported at Minimum Reporting Rate across the entire time period / A tuple. The first value is the average latency expressed as a time period. If the value is reported synchronously (e.g., sampled), the second value is the measurement time period. If the value is reported asynchronously (e.g., event report), the second value is set to zero.
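The Accuracy and Bias calculations in Table 4 can be illustrated with a short sketch. It is not a normative method. The function names are hypothetical, the caller supplies the denominator (Range, Full Scale, or actual Value) rather than the code deriving it, and bias at a point is assumed to be reported value minus actual value, with "maximum bias" read as the signed error of largest magnitude.

```python
from statistics import mean
from typing import Sequence, Tuple

# Codes from the Accuracy Value Expression: which basis the percentage uses.
RANGE, FULL_SCALE, VALUE = 1, 2, 3


def accuracy_tuple(
    reported: Sequence[float],
    actual: Sequence[float],
    basis: int,
    denominator: float,
) -> Tuple[float, int, int]:
    """Return (accuracy in percent, basis code, sample size)."""
    if not reported or len(reported) != len(actual):
        raise ValueError("need equally sized, non-empty samples")
    if basis not in (RANGE, FULL_SCALE, VALUE):
        raise ValueError("basis must be 1, 2, or 3")
    # Absolute value of the average of (reported value - actual value).
    avg_error = abs(mean(r - a for r, a in zip(reported, actual)))
    return (100.0 * avg_error / denominator, basis, len(reported))


def bias_tuples(
    reported: Sequence[float],
    actual: Sequence[float],
) -> Tuple[Tuple[float, int], Tuple[float, int]]:
    """Return ((maximum bias, points), (average bias, points)) in Units."""
    if not reported or len(reported) != len(actual):
        raise ValueError("need equally sized, non-empty samples")
    errors = [r - a for r, a in zip(reported, actual)]
    max_bias = max(errors, key=abs)  # assumption: largest magnitude, sign kept
    return (max_bias, len(errors)), (mean(errors), len(errors))
```

As a usage example, three points with reported values 10.5, 20.25, 30.75 against actual values 10, 20, 30 and a Range of 100 give an accuracy tuple of (0.5, 1, 3), that is, 0.5% of Range from a sample size of 3.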
APPENDIX 1