© ISO/IEC    ISO/IEC 15938-1:2001(E)
INTERNATIONAL ORGANIZATION FOR STANDARDIZATION
ORGANISATION INTERNATIONALE DE NORMALISATION
ISO/IEC JTC1/SC29/WG11
CODING OF MOVING PICTURES AND ASSOCIATED AUDIO
ISO/IEC JTC1/SC29/WG11
MPEG01/N4001
March 2001, Singapore
Title: Text of ISO/IEC FCD 15938-1 Information technology - Multimedia content description interface - Part 1: Systems
Source: Systems Sub-group
Editor: Claude Seyrat (Expway), Michael Wollborn (Bosch), Ali Tabatabai (Sony), Olivier Avaro (France Telecom R&D)
Status: Draft
ISO/IEC JTC1/SC29/WG11 N4001
Date: 2001-03-16
ISO/IEC 15938-1:2001(E)
ISO/IEC JTC1/SC29/WG11
Modified by the SC 29 Secretariat
Information technology – Multimedia Content Description Interface –
Part 1: Systems
Contents
0 Introduction
0.1 Overview
0.2 Overview of this part of ISO/IEC 15938
1 Scope
2 Normative References
3 Terms and Definitions
3.1 Access Units
3.2 Fast Access
3.3 Fragment update command
3.4 Fragment Update Unit
3.5 Navigation Mode
3.6 Navigation Path
3.7 Random Access
3.8 Root Element
3.9 Schema
3.10 Terminal
3.11 Top Level Element
4 Abbreviations and Symbols
4.1 Arithmetic operators
4.2 Logical operators
4.3 Relational operators
4.4 Assignment
4.5 Mnemonics
4.6 Conventions
4.6.1 Method of describing bitstream syntax
4.6.2 Definition of bytealigned() function
4.6.3 Reserved, forbidden and marker_bit
5 Systems architecture
5.1 Terminal architecture
5.2 Access Unit
5.3 Normative interfaces
5.3.1 Description the normative interfaces
5.3.2 Validation of the standard
6 Textual Format
6.1 Decoder Configuration
6.1.1 Syntax
6.1.2 Semantics
6.2 Access Unit
6.2.1 Syntax
6.2.2 Semantics
6.3 Fragment update unit
6.3.1 Navigation
6.3.2 Fragment update command
6.3.3 Fragment payload
7 Binary Format
7.1 Binary Decoder Configuration
7.1.1 Binary Decoder Configuration Syntax
7.1.2 Binary Decoder Configuration Semantics
7.2 Binary Access Unit
7.2.1 Binary Access Unit Syntax
7.2.2 Binary Access Unit Semantics
7.3 Binary Fragment Update Unit
7.3.1 Navigation
7.3.2 Fragment update command
8 BiM fragment payload
8.1 Character string comparison
8.2 General Binary format
8.2.1 Decoding Modes
8.3 Element decoding
8.3.1 Syntax
8.3.2 Semantic
8.3.3 Internal Element Decoding
8.3.4 ElementContent
8.4 Generating keys
8.4.1 Schema compilation and structure code
8.4.2 Specific case of attributes
8.4.3 Substitution
8.4.4 Type coding
8.4.5 Fast Access
8.4.6 Datatypes Coding
Annex C (informative) MPEG-7 meta data flow
Annex D (informative) Informative Educational Examples for the MPEG-7 BiM
1 Example for general structure
2 Example for decoding of sub-trees
2.1 General Example
2.2 Syntax tree generation
2.3 Syntax tree simplification
2.4 Attributes coding
2.5 Decoding automata
2.6 Realized automaton
2.7 Infinite unsigned integer coding
2.8 A forward compatible coding
Annex B (informative) Patent statements
Annex A (informative) Bibliography
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work.
In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1. Draft International Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as an International Standard requires approval by at least 75 % of the national bodies casting a vote.
The MPEG-7 standard, also known as "Multimedia Content Description Interface", aims at providing standardized core technologies allowing the description of audiovisual data content in multimedia environments. In order to achieve this broad goal, MPEG-7 will standardize:
- Descriptors (D): representations of Features, that define the syntax and the semantics of each feature representation,
- Description Schemes (DS), that specify the structure and semantics of the relationships between their components, which may be both Ds and DSs,
- A Description Definition Language (DDL), to allow the creation of new DSs and, possibly, Ds, and to allow the extension and modification of existing DSs,
- System tools, to support multiplexing of descriptions, synchronization issues, transmission mechanisms, file format, etc.
The MPEG-7 standard consists of the following parts, under the general title Information Technology - Multimedia Content Description Interface:
- Part 1: Systems. Architecture of the standard, tools that are needed to prepare MPEG-7 Descriptions for efficient transport and storage, and to allow synchronization between content and descriptions. Also tools related to managing and protecting intellectual property. Representation of DSs and Ds and especially binary representation.
- Part 2: Description definition language (DDL). Language for defining new DSs and perhaps eventually also for new Ds.
- Part 3: Visual. Visual elements (Ds and DSs).
- Part 4: Audio. Audio elements (Ds and DSs).
- Part 5: Multimedia description schemes. Elements (Ds and DSs) that are generic, i.e. neither purely visual nor purely audio.
- Part 6: Reference software. Software implementation of relevant parts of the MPEG-7 Standard.
- Part 7: Conformance testing. Guidelines and procedures for testing conformance of MPEG-7 implementations.
This document represents the current Final Committee Draft of Part 1 of the MPEG-7 standard, the Systems Final Committee Draft.
0 Introduction
0.1 Overview
The MPEG-7 standard, also known as "Multimedia Content Description Interface", aims at providing standardized core technologies allowing the description of audiovisual data content in multimedia environments [1]. This is a challenging task given the broad spectrum of requirements and targeted multimedia applications, and the large number of audiovisual features of importance in such a context. In order to achieve this broad goal, MPEG-7 will standardize:
- Descriptors (D): representations of Features, that define the syntax and the semantics of each feature representation;
- Description Schemes (DS), that specify the structure and semantics of the relationships between their components, which may be both Ds and DSs;
- A Description Definition Language (DDL), to allow the creation of new DSs and, possibly, Ds, and to allow the extension and modification of existing DSs;
- System tools, to support multiplexing of descriptions, synchronization issues, transmission mechanisms, file format, etc.
This part of the specification describes the Systems layer, comprising the tools that are needed to prepare MPEG-7 Descriptions for efficient transport and storage, and to allow synchronization between content and descriptions. It also covers tools related to managing and protecting intellectual property.
0.2 Overview of this part of ISO/IEC 15938
This part of ISO/IEC 15938 specifies the following tools:
- a terminal architecture defined as a normative compression layer interfaced with a non-normative delivery layer,
- an MPEG-7 elementary stream format composed of textual or binary access units,
- a standardized interface for the textual format,
- a standardized interface for the binary format.
The structure of this document is the following:
- Clause 5 specifies the terminal architecture.
- Clause 6 specifies the textual format for MPEG-7 content.
- Clause 7 specifies the binary format for MPEG-7 content.
- Clause 8 specifies the binary fragment payload.
- Annex A contains an informative overview of the flow of metadata through the content creation and delivery lifecycle.
- Annex B contains informative examples for the binary format for MPEG-7 description (BiM).
- Annexes C and D contain Patent Statements and the Bibliography respectively.
Information technology – Multimedia content description interface –
Part 1: Systems
1 Scope
This part of ISO/IEC 15938 specifies system level functionalities for the communication of multimedia content descriptions. It provides an unambiguous specification which will enable MPEG-7 users and developers:
- to develop MPEG-7 conformant decoders,
- to prepare MPEG-7 Descriptions for efficient transport and storage.
2 Normative References
The following ITU-T Recommendations and International Standards contain provisions, which, through reference in this text, constitute provisions of ISO/IEC 15938. At the time of publication, the editions indicated were valid. All Recommendations and Standards are subject to revision, and parties to agreements based on ISO/IEC 15938 are encouraged to investigate the possibility of applying the most recent editions of the standards indicated below. Members of IEC and ISO maintain registers of currently valid International Standards. The Telecommunication Standardization Bureau maintains a list of currently valid ITU-T Recommendations.
3 Terms and Definitions
3.1 Access Units
A set of fragment update units that are atomic in time, i.e. the set of smallest description entities to which a time element can be attached.
3.2 Fast Access
The process of skipping undesired parts of the stream without entirely decoding them.
3.3 Fragment update command
A command that tells the decoder which transformation should be applied to the description.
3.4 Fragment Update Unit
Fragment Update Units are the main components of MPEG-7 access units. They provide the dynamic aspects of the MPEG-7 description.
3.5 Navigation Mode
A command that tells the decoder which instantiated part of the description it will receive.
3.6 Navigation Path
The absolute or relative address of the tree node for which the decoder will receive the description.
3.7 Random Access
The process of beginning to read and decode a coded representation at an arbitrary point within the description stream.
3.8 Root Element
The most global element of a description, i.e. the element that contains the entire description.
3.9 Schema
The set of Schema Components which define a class of XML documents by expressing syntactic, structural and value constraints applicable to document instances.
3.10 Terminal
The entity that uses the coded representation of the multimedia content description information.
3.11 Top Level Element
The children of the root element. Top level elements are chosen from a finite set of possible elements defined in Parts 3, 4 and 5 of the standard.
4 Abbreviations and Symbols
BiM / Binary format for MPEG-7
D / Descriptor
DDL / Description Definition Language
DS / Description Scheme
TBC / Tree Branch Codes
UCS / Universal Character Set
URI / Uniform Resource Identifier
URL / Uniform Resource Locator
UTF / UCS transformation formats
XML / Extensible Markup Language 1.0
XPath / XML Path Language
The mathematical operators used to describe this part of ISO/IEC 15938 are similar to those used in the C programming language. However, integer divisions with truncation and rounding are specifically defined. Numbering and counting loops generally begin from zero.
4.1 Arithmetic operators
+    Addition.
-    Subtraction (as a binary operator) or negation (as a unary operator).
++    Increment, i.e. x++ is equivalent to x = x + 1.
--    Decrement, i.e. x-- is equivalent to x = x - 1.
*    Multiplication.
^    Power.
/    Integer division with truncation of the result toward zero. For example, 7/4 and -7/-4 are truncated to 1 and -7/4 and 7/-4 are truncated to -1.
//    Integer division with rounding to the nearest integer. Half-integer values are rounded away from zero unless otherwise specified. For example 3//2 is rounded to 2, and -3//2 is rounded to -2.
///    Integer division with sign dependent rounding to the nearest integer. Half-integer values when positive are rounded away from zero, and when negative are rounded towards zero. For example 3///2 is rounded to 2, and -3///2 is rounded to -1.
////    Integer division with truncation towards negative infinity.
÷    Used to denote division in mathematical equations where no truncation or rounding is intended.
%    Modulus operator. Defined only for positive numbers.
Sign( )    Sign of the argument.
Abs( )    Absolute value of the argument.
$\sum_{i=a}^{i<b} f(i)$    The summation of f(i) with i taking integral values from a up to, but not including, b.
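The four integer division operators differ only in how they round, which is easy to get wrong in an implementation. The following is a minimal Python sketch of the rounding rules stated above; the function names are illustrative assumptions and do not appear in this part of ISO/IEC 15938.

```python
# Illustrative helpers (names are assumptions, not part of the standard).
# All functions take integers a and b, with b != 0.

def div_trunc(a, b):
    """'/': integer division, result truncated toward zero."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

def div_round(a, b):
    """'//': nearest integer, half-integer values rounded away from zero."""
    sign = 1 if (a < 0) == (b < 0) else -1
    a, b = abs(a), abs(b)
    return sign * ((2 * a + b) // (2 * b))

def div_round_signed(a, b):
    """'///': nearest integer; positive halves round away from zero,
    negative halves round toward zero, i.e. floor(a/b + 1/2)."""
    return (2 * a + b) // (2 * b)  # Python's // is floor division

def div_floor(a, b):
    """'////': integer division, result truncated toward negative infinity."""
    return a // b

# Examples taken from the operator definitions:
#   7/4 and -7/-4 give 1, -7/4 and 7/-4 give -1
#   3//2 gives 2, -3//2 gives -2
#   3///2 gives 2, -3///2 gives -1
```

Note that Python's built-in `//` operator behaves like the `////` operator of this clause, not like its `//` operator, so the two notations should not be confused when porting pseudo-code.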
4.2 Logical operators
||    Logical OR.
&&    Logical AND.
!    Logical NOT.
4.3 Relational operators
>    Greater than.
>=    Greater than or equal to.
≥    Greater than or equal to.
<    Less than.
<=    Less than or equal to.
≤    Less than or equal to.
==    Equal to.
!=    Not equal to.
max [ , … , ]    The maximum value in the argument list.
min [ , … , ]    The minimum value in the argument list.
4.4 Assignment
=    Assignment operator.
4.5 Mnemonics
The following mnemonics are defined to describe the different data types used in the coded bitstream.
bslbf    Bit string, left bit first, where “left” is the order in which bit strings are written in this part of ISO/IEC 15938. Bit strings are generally written as a string of 1s and 0s within single quote marks, e.g. ‘1000 0001’. Blanks within a bit string are for ease of reading and have no significance. For convenience, large strings are occasionally written in hexadecimal; in this case conversion to binary in the conventional manner will yield the value of the bit string. Thus the leftmost hexadecimal digit is first, and in each hexadecimal digit the most significant of the four bits is first.
uimsbf    Unsigned integer, most significant bit first.
simsbf    Signed integer, in two's complement format, most significant (sign) bit first.
vlclbf    Variable length code, left bit first, where “left” refers to the order in which the VLC codes are written. The byte order of multibyte words is most significant byte first.
vuimsbf    Variable length code unsigned integer, most significant bit first. If the number of bits needed to represent the integer exceeds 4, then the first n bits (Ext), which are all 1 except for the n-th bit, which is 0, indicate that the Position Code is encoded by n times 4 bits. An informative example is shown in Figure 1.
Figure 1 - Informative example for the vuimsbf data type
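The following Python sketch shows one possible reading of the vuimsbf description: an Ext prefix of n-1 one bits terminated by a zero bit, followed by the value in n times 4 bits, most significant bit first. Since Figure 1 is not reproduced here, the treatment of values that fit in 4 bits (a single '0' Ext bit, i.e. n = 1) is an assumption, as are the function names and the use of '0'/'1' character strings to stand for bits.

```python
# Sketch of one interpretation of vuimsbf; not a normative implementation.
# Bits are modelled as a str of '0' and '1' characters (an assumption made
# for readability).

def encode_vuimsbf(value):
    """Return the assumed vuimsbf bit string for a non-negative integer."""
    if value < 0:
        raise ValueError("vuimsbf encodes unsigned integers only")
    # Number of 4-bit groups needed; at least one group even for value 0.
    n = max(1, (value.bit_length() + 3) // 4)
    ext = "1" * (n - 1) + "0"               # n Ext bits: n-1 ones, then a zero
    payload = format(value, "0{}b".format(4 * n))
    return ext + payload

def decode_vuimsbf(bits, pos=0):
    """Decode one value starting at bit index pos.

    Returns (value, next_pos)."""
    n = 1
    while bits[pos] == "1":                 # count the leading one bits of Ext
        n += 1
        pos += 1
    pos += 1                                # skip the terminating zero bit
    value = int(bits[pos:pos + 4 * n], 2)   # n * 4 payload bits, MSB first
    return value, pos + 4 * n
```

Under this reading, a value needing 9 bits uses n = 3, so the coded form has 3 Ext bits followed by 12 payload bits.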
4.6 Conventions
4.6.1 Method of describing bitstream syntax
The bitstream retrieved by the decoder is described in Clause 7. Each data item in the bitstream is in bold type. It is described by its name, its length in bits, and a mnemonic for its type and order of transmission.
The action caused by a decoded data element in a bitstream depends on the value of that data element and on data elements previously decoded. The following constructs are used to express the conditions when data elements are present, and are in normal type:
while ( condition ) {
    data_element
    . . .
}
If the condition is true, then the group of data elements occurs next in the data stream. This repeats until the condition is not true.

do {
    data_element
    . . .
} while ( condition )
The data element always occurs at least once. The data element is repeated until the condition is not true.

if ( condition ) {
    data_element
    . . .
} else {
    data_element
    . . .
}
If the condition is true, then the first group of data elements occurs next in the data stream. If the condition is not true, then the second group of data elements occurs next in the data stream.

for ( i = m; i < n; i++) {
    data_element
    . . .
}
The group of data elements occurs (n-m) times. Conditional constructs within the group of data elements may depend on the value of the loop control variable i, which is set to m for the first occurrence, incremented by one for the second occurrence, and so forth.

/* comment */
Explanatory comment that may be deleted entirely without in any way altering the syntax.

This syntax uses the ‘C-code’ convention that a variable or expression evaluating to a non-zero value is equivalent to a condition that is true and a variable or expression evaluating to a zero value is equivalent to a condition that is false.
data_element [n]    data_element [n] is the (n+1)th element of an array of data.
data_element [m][n]    data_element [m][n] is the (m+1, n+1)th element of a two-dimensional array of data.
data_element [l][m][n]    data_element [l][m][n] is the (l+1, m+1, n+1)th element of a three-dimensional array of data.
4.6.2 Definition of bytealigned() function
The function bytealigned() returns 1 if the current position is on a byte boundary, that is, the next bit in the bitstream is the first bit in a byte. Otherwise it returns 0.
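As an illustration, the sketch below shows how a decoder might track its bit position in order to implement bytealigned() together with the uimsbf and simsbf mnemonics of 4.5. The class name `BitReader` and its method names are hypothetical and are not defined by this part of ISO/IEC 15938.

```python
# Hypothetical bit reader (class and method names are assumptions).

class BitReader:
    def __init__(self, data):
        self.data = bytes(data)
        self.pos = 0                        # current position, counted in bits

    def bytealigned(self):
        """Return 1 if the next bit is the first bit of a byte, else 0."""
        return 1 if self.pos % 8 == 0 else 0

    def read_uimsbf(self, n):
        """Read an n-bit unsigned integer, most significant bit first."""
        value = 0
        for _ in range(n):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value

    def read_simsbf(self, n):
        """Read an n-bit (n >= 1) two's complement signed integer,
        most significant (sign) bit first."""
        value = self.read_uimsbf(n)
        if value >= 1 << (n - 1):           # sign bit set: value is negative
            value -= 1 << n
        return value
```

For example, after reading 3 bits from a fresh reader bytealigned() returns 0, and after reading 5 more bits it returns 1 again.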
4.6.3 Reserved, forbidden and marker_bit
The terms “reserved” and “forbidden” are used in the description of some values of several fields in the coded bitstream.
The term “reserved” indicates that the value may be used in the future for ISO/IEC defined extensions.
The term “forbidden” indicates a value that shall never be used (usually in order to avoid emulation of start codes).
The term “marker_bit” indicates a one bit integer in which the value zero is forbidden (and it therefore shall have the value ‘1’). These marker bits are introduced at several points in the syntax to avoid start code emulation.
The term “zero_bit” indicates a one bit integer with the value zero.
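A decoder may wish to verify these fixed-value bits while parsing. The following minimal sketch checks a decoded bit against the value required for marker_bit or zero_bit; the function name `check_fixed_bit` and the choice to raise an exception on a mismatch are assumptions, since error handling is not specified here.

```python
# Hypothetical validation helper (name and error handling are assumptions).

FIXED_BIT_VALUES = {
    "marker_bit": 1,   # the value zero is forbidden
    "zero_bit": 0,     # always has the value zero
}

def check_fixed_bit(name, bit):
    """Return the bit if it has the value required for the named field,
    otherwise raise ValueError."""
    expected = FIXED_BIT_VALUES[name]
    if bit != expected:
        raise ValueError(
            "{} shall have the value {}, got {}".format(name, expected, bit)
        )
    return bit
```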
5 Systems architecture
5.1 Terminal architecture
The information representation specified in the MPEG-7 standard provides the means to represent coded multimedia content description information. The entity that makes use of such a coded representation of the multimedia content is generically referred to as a "terminal". This terminal may correspond to a standalone application or be part of an application system.
The objective of this section is to describe a terminal making use of MPEG-7 representations. The architecture of such a terminal is depicted in Figure 2 and its overall operation is described in this section. The following sections further describe the overall operation of the tools specified in this part of the MPEG-7 specification.
Figure 2 - MPEG-7 Architecture
At the bottom of Figure 2 is the transmission/storage medium. This refers to the lower layers of the delivery infrastructure (network layer and below, as well as storage). These layers deliver multiplexed streams to the Delivery layer. The transport of the MPEG-7 data can occur on a variety of delivery systems. These include, for example, MPEG-2 Transport Streams, IP (Internet Protocol), or MPEG-4 (MP4) files or streams. The Delivery layer encompasses mechanisms allowing synchronization, framing and multiplexing of MPEG-7 content. MPEG-7 content may be delivered independently of, or together with, the content it describes. The delivery of MPEG-7 content on particular systems is outside the scope of this specification. Not all MPEG-7 streams have to be downstream (from the server to the client). The MPEG-7 architecture allows data, such as queries or requests, to be conveyed back from the terminal to the transmitter or server.
The Delivery layer provides MPEG-7 elementary streams to the Compression layer. MPEG-7 elementary streams consist of consecutive, individually accessible portions of data named Access Units. An access unit is the smallest data entity to which timing information can be attributed. MPEG-7 elementary streams contain information of different kinds:
- Schema information: this information defines the structure of the MPEG-7 description;
- Description information: this information is either the complete description of the multimedia content or fragments of the description.
The Delivery layer of a complete application may also be capable of providing the multimedia content data if requested. Such delivery mechanisms are outside the scope of this specification. Existing delivery tools may be used for this purpose.