INTERNATIONAL ORGANIZATION FOR STANDARDIZATION

ORGANISATION INTERNATIONALE DE NORMALISATION

ISO/IEC JTC 1/SC 29/WG 11

CODING OF MOVING PICTURES AND AUDIO

ISO/IEC JTC 1/SC 29/WG 11 N7935

Jan 2006, Bangkok, Thailand

Source: Audio Subgroup
Title: Text of ISO/IEC 14496-3:2005/PDAM 5, BSAC extensions
Status: Approved


ISO/IEC JTC 1/SC 29 N

Date: 2006-01-22

ISO/IEC 14496-3:200X/PDAM 5

ISO/IEC JTC 1/SC 29/WG 11

Secretariat:

Information technology— Coding of audio-visual objects— Part3: Audio, AMENDMENT 5: BSAC extensions

Élément introductif— Élément central— Partie3: Titre de la partie

Warning

This document is not an ISO International Standard. It is distributed for review and comment. It is subject to change without notice and may not be referred to as an International Standard.

Recipients of this draft are invited to submit, with their comments, notification of any relevant patent rights of which they are aware and to provide supporting documentation.

Copyright notice

This ISO document is a working draft or committee draft and is copyright-protected by ISO. While the reproduction of working drafts or committee drafts in any form for use by participants in the ISO standards development process is permitted without prior permission from ISO, neither this document nor any extract from it may be reproduced, stored or transmitted in any form for any other purpose without prior written permission from ISO.

Requests for permission to reproduce this document for the purpose of selling it should be addressed as shown below or to ISO's member body in the country of the requester:

[Indicate the full address, telephone number, fax number, telex number, and electronic mail address, as appropriate, of the Copyright Manager of the ISO member body responsible for the secretariat of the TC or SC within the framework of which the working document has been prepared.]

Reproduction for sales purposes may be subject to royalty payments or a licensing agreement.

Violators may be prosecuted.

Foreword

ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of the joint technical committee is to prepare International Standards. Draft International Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as an International Standard requires approval by at least 75% of the national bodies casting a vote.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.

Amendment 5 to ISO/IEC 14496-3:200X was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology, Subcommittee SC 29, Coding of Audio, Picture, Multimedia and Hypermedia Information.

This Amendment specifies the normative syntax of the integration of ER-BSAC and SBR and the decoding process. An informative encoder description is also given.

Information technology— Coding of audio-visual objects— Part3: Audio, AMENDMENT 5: BSAC extensions

Amendment Subpart 1

1  SBR object type

In Part 3: Audio, Subpart 1, in subclause 1.5.1.2.6 SBR object type, replace Table 1.2A with the table below:

Table 1.2A – Audio object types that can be combined with the SBR Tool

Audio Object Type / Combination with SBR Tool permitted / Object Type ID
Null / / 0
AAC main / X / 1
AAC LC / X / 2
AAC SSR / X / 3
AAC LTP / X / 4
SBR / / 5
AAC Scalable / X / 6
TwinVQ / / 7
CELP / / 8
HVXC / / 9
(Reserved) / / 10
(Reserved) / / 11
TTSI / / 12
Main synthetic / / 13
Wavetable synthesis / / 14
General MIDI / / 15
Algorithmic Synthesis and Audio FX / / 16
ER AAC LC / X / 17
(Reserved) / / 18
ER AAC LTP / X / 19
ER AAC scalable / X / 20
ER TwinVQ / / 21
ER BSAC / X / 22
ER AAC LD / / 23
ER CELP / / 24
ER HVXC / / 25
ER HILN / / 26
ER Parametric / / 27
SSC / / 28
PS / / 29
(Reserved) / / 30
(Reserved) / / 31

In Part 3: Audio, Subpart 1, in subclause 1.6.2.1 AudioSpecificConfig, replace Table 1.8 with the table below:

Table 1.8 – Syntax of AudioSpecificConfig()

Syntax / No. of bits / Mnemonic
AudioSpecificConfig ()
{
    audioObjectType = GetAudioObjectType();
    samplingFrequencyIndex; / 4 / uimsbf
    if ( samplingFrequencyIndex == 0xf )
        samplingFrequency; / 24 / uimsbf
    channelConfiguration; / 4 / uimsbf
    sbrPresentFlag = -1;
    psPresentFlag = -1;
    if ( audioObjectType == 5 ||
         audioObjectType == 29 ) {
        extensionAudioObjectType = audioObjectType;
        sbrPresentFlag = 1;
        if ( audioObjectType == 29 ) {
            psPresentFlag = 1;
        }
        extensionSamplingFrequencyIndex; / 4 / uimsbf
        if ( extensionSamplingFrequencyIndex == 0xf )
            extensionSamplingFrequency; / 24 / uimsbf
        audioObjectType = GetAudioObjectType();
        if ( audioObjectType == 22 )
            extensionChannelConfiguration; / 4 / uimsbf
    }
    else {
        extensionAudioObjectType = 0;
    }
    if ( audioObjectType == 1 || audioObjectType == 2 ||
         audioObjectType == 3 || audioObjectType == 4 ||
         audioObjectType == 6 || audioObjectType == 7 )
        GASpecificConfig();
    if ( audioObjectType == 8 )
        CelpSpecificConfig();
    if ( audioObjectType == 9 )
        HvxcSpecificConfig();
    if ( audioObjectType == 12 )
        TTSSpecificConfig();
    if ( audioObjectType == 13 || audioObjectType == 14 ||
         audioObjectType == 15 || audioObjectType == 16 )
        StructuredAudioSpecificConfig();
    /* the following Objects are Amendment 1 Objects */
    if ( audioObjectType == 17 || audioObjectType == 19 ||
         audioObjectType == 20 || audioObjectType == 21 ||
         audioObjectType == 22 || audioObjectType == 23 )
        GASpecificConfig();
    if ( audioObjectType == 24 )
        ErrorResilientCelpSpecificConfig();
    if ( audioObjectType == 25 )
        ErrorResilientHvxcSpecificConfig();
    if ( audioObjectType == 26 || audioObjectType == 27 )
        ParametricSpecificConfig();
    if ( audioObjectType == 17 || audioObjectType == 19 ||
         audioObjectType == 20 || audioObjectType == 21 ||
         audioObjectType == 22 || audioObjectType == 23 ||
         audioObjectType == 24 || audioObjectType == 25 ||
         audioObjectType == 26 || audioObjectType == 27 ) {
        epConfig; / 2 / uimsbf
        if ( epConfig == 2 || epConfig == 3 ) {
            ErrorProtectionSpecificConfig();
        }
        if ( epConfig == 3 ) {
            directMapping; / 1 / uimsbf
            if ( ! directMapping ) {
                /* tbd */
            }
        }
    }
    if ( extensionAudioObjectType != 5 &&
         bits_to_decode() >= 16 ) {
        syncExtensionType; / 11 / bslbf
        if ( syncExtensionType == 0x2b7 ) {
            extensionAudioObjectType = GetAudioObjectType();
            if ( extensionAudioObjectType == 5 ) {
                sbrPresentFlag; / 1 / uimsbf
                if ( sbrPresentFlag == 1 ) {
                    extensionSamplingFrequencyIndex; / 4 / uimsbf
                    if ( extensionSamplingFrequencyIndex == 0xf ) {
                        extensionSamplingFrequency; / 24 / uimsbf
                    }
                    if ( bits_to_decode() >= 12 ) {
                        syncExtensionType; / 11 / bslbf
                        if ( syncExtensionType == 0x548 ) {
                            psPresentFlag; / 1 / uimsbf
                        }
                    }
                }
            }
            if ( extensionAudioObjectType == 22 ) {
                sbrPresentFlag; / 1 / uimsbf
                if ( sbrPresentFlag == 1 ) {
                    extensionSamplingFrequencyIndex; / 4 / uimsbf
                    if ( extensionSamplingFrequencyIndex == 0xf )
                        extensionSamplingFrequency; / 24 / uimsbf
                }
                extensionChannelConfiguration; / 4 / uimsbf
            }
        }
    }
}
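
As an informal illustration of the hierarchical branch of the syntax above (audioObjectType 5 or 29 signaled first), the following Python sketch reads the leading fields of an AudioSpecificConfig() up to and including extensionChannelConfiguration. It is not part of the normative text. The names BitReader and parse_asc_prefix are hypothetical, GetAudioObjectType() is assumed to be a plain 5-bit read with no escape mechanism, and the sketch stops before GASpecificConfig() and the other object-specific configurations.

```python
class BitReader:
    """Reads unsigned MSB-first (uimsbf) bit fields from a bytes object."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position in bits

    def read(self, nbits: int) -> int:
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def parse_asc_prefix(data: bytes) -> dict:
    """Parse the leading fields of AudioSpecificConfig() (sketch only)."""
    r = BitReader(data)
    cfg = {"sbrPresentFlag": -1, "psPresentFlag": -1,
           "extensionAudioObjectType": 0}
    # Assumption: GetAudioObjectType() is modeled as a plain 5-bit field.
    aot = r.read(5)
    cfg["samplingFrequencyIndex"] = r.read(4)
    if cfg["samplingFrequencyIndex"] == 0xF:
        cfg["samplingFrequency"] = r.read(24)
    cfg["channelConfiguration"] = r.read(4)
    if aot in (5, 29):
        # Hierarchical signaling: SBR (or PS) is signaled first.
        cfg["extensionAudioObjectType"] = aot
        cfg["sbrPresentFlag"] = 1
        if aot == 29:
            cfg["psPresentFlag"] = 1
        cfg["extensionSamplingFrequencyIndex"] = r.read(4)
        if cfg["extensionSamplingFrequencyIndex"] == 0xF:
            cfg["extensionSamplingFrequency"] = r.read(24)
        aot = r.read(5)  # underlying audio object type
        if aot == 22:    # ER BSAC carries extensionChannelConfiguration
            cfg["extensionChannelConfiguration"] = r.read(4)
    cfg["audioObjectType"] = aot
    cfg["bits_consumed"] = r.pos
    return cfg


# Example input: AOT 5, samplingFrequencyIndex 6, channelConfiguration 2,
# extensionSamplingFrequencyIndex 3, AOT 22, extensionChannelConfiguration 6.
example = parse_asc_prefix(bytes([0x2B, 0x11, 0xD9, 0x80]))
```

The backward compatible branch at the end of the syntax (syncExtensionType 0x2b7) is deliberately left out, since it requires the object-specific configuration to be parsed first.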

In Part 3: Audio, Subpart 1, after 1.6.3.14 psPresentFlag, add:

1.6.3.15 extensionChannelConfiguration

A four-bit field indicating the channel configuration of the BSAC multichannel extension, according to Table 1.11.

In Part 3: Audio, Subpart 1, after 1.6.3.14 psPresentFlag, add:

1.1.7  Implicit and Explicit signaling for BSAC extension payloads

The implicit signaling method for BSAC extension payloads is similar to that of the SBR and PS tools. A BSAC decoder that can decode the BSAC extension payloads checks whether the bsac_raw_data_block() contains an extension type related to the SBR tool, such as ‘EXT_BSAC_SBR_DATA’ or ‘EXT_BSAC_SBR_DATA_CRC’. If such an extension type is detected and the SBR tool is operated in dual-rate mode, the sampling frequency is updated. In addition, such a decoder checks whether the bsac_raw_data_block() contains an extension type related to the BSAC channel extension, such as ‘EXT_BSAC_CHANNEL’. If it is detected, the number of channels given by the AudioSpecificConfig() for the BSAC Audio Object Type is updated according to the ‘channel_configuration_index’ of each extended_bsac_base_element().

When explicit signaling is used, implicit signaling shall not occur. Two different types of explicit signaling are available:

1.  Explicit Signaling Method 1: hierarchical signaling

If the first audioObjectType (AOT) signaled is the SBR AOT, a second audio object type is signaled which indicates the underlying audio object type. This signaling method is not backward compatible. If the second audioObjectType is the ER BSAC AOT, the extensionChannelConfiguration indicates the total number of channels in the bsac_raw_data_block().

2.  Explicit Signaling Method 2: backward compatible signaling

The extensionAudioObjectType is signaled at the end of the AudioSpecificConfig(). If the extensionAudioObjectType is the ER BSAC AOT, the extensionChannelConfiguration indicates the total number of channels in the bsac_raw_data_block(). This method shall only be used in systems that convey the length of the AudioSpecificConfig(). Hence, it shall not be used for LATM with audioMuxVersion==0.

Table 1.23 describes the decoder behavior with SBR and BSAC channel extension signaling.


Table 1.23 – SBR and BSAC channel extension signaling and Corresponding Decoder Behavior

Bitstream characteristics (columns 1 to 4) / Decoder behavior (columns 5 and 6)
extensionAudioObjectType / sbrPresentFlag / extensionChannelConfiguration / raw_data_block / BSAC decoder / Extended BSAC decoder
!= ER_BSAC (Implicit Signaling) / -1 (Note 1) / Not available / BSAC / Play BSAC / Play BSAC
!= ER_BSAC (Implicit Signaling) / -1 (Note 1) / Not available / BSAC+SBR / Play BSAC / Play at least BSAC, should play BSAC+SBR
!= ER_BSAC (Implicit Signaling) / -1 (Note 1) / Not available / BSAC+MC / Play BSAC / Play at least BSAC, should play BSAC+MC
!= ER_BSAC (Implicit Signaling) / -1 (Note 1) / Not available / BSAC+SBR+MC / Play BSAC / Play at least BSAC, should play BSAC+SBR+MC
== ER_BSAC (Explicit Signaling) / 0 (Note 2) / == channelConfiguration (Note 4) / BSAC / Play BSAC / Play BSAC
== ER_BSAC (Explicit Signaling) / 0 (Note 2) / != channelConfiguration / BSAC+MC / Play BSAC / Play BSAC+MC
== ER_BSAC (Explicit Signaling) / 1 (Note 3) / == channelConfiguration (Note 4) / BSAC+SBR / Play BSAC / Play BSAC+SBR
== ER_BSAC (Explicit Signaling) / 1 (Note 3) / != channelConfiguration / BSAC+SBR+MC / Play BSAC / Play BSAC+SBR+MC
Note 1: Implicit signaling, check payload in order to determine output sampling frequency, or assume the presence of SBR data in the payload, giving an output sampling frequency of twice the sampling frequency indicated by samplingFrequency in the AudioSpecificConfig() (unless the down sampled SBR Tool is operated, or twice the sampling frequency indicated by samplingFrequency exceeds the maximum allowed output sampling frequency of the current level, in which case the output sampling frequency is the same as indicated by samplingFrequency).
Note 2: Explicitly signals that there is no SBR data, hence no implicit signaling is present, and the output sampling frequency is given by samplingFrequency in the AudioSpecificConfig().
Note 3: Output sampling frequency is the extensionSamplingFrequency in AudioSpecificConfig().
Note 4: Explicitly signals that there is no BSAC channel extension data, and the number of output channels is given by channelConfiguration in the AudioSpecificConfig().
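
The explicit signaling rows of Table 1.23 and the sampling frequency rules of Notes 1 to 3 can be summarized informally in the Python sketch below. It is illustrative only: the function names, parameter names and returned strings are hypothetical, and for implicit signaling it models only the option in Note 1 of assuming that SBR data is present, not the option of inspecting the payload.

```python
def explicit_signaling_behavior(sbr_present_flag, ext_channel_config,
                                channel_config):
    """Explicit signaling rows of Table 1.23 (extensionAudioObjectType == ER_BSAC)."""
    if sbr_present_flag not in (0, 1):
        raise ValueError("explicit signaling requires sbrPresentFlag 0 or 1")
    parts = ["BSAC"]
    if sbr_present_flag == 1:
        parts.append("SBR")
    # Per Note 4, equality with channelConfiguration signals that there is
    # no BSAC channel extension (MC) data.
    if ext_channel_config != channel_config:
        parts.append("MC")
    payload = "+".join(parts)
    return {
        "raw_data_block": payload,
        "bsac_decoder": "Play BSAC",             # legacy decoder ignores extensions
        "extended_bsac_decoder": "Play " + payload,
    }


def output_sampling_frequency(sbr_present_flag, sampling_frequency,
                              extension_sampling_frequency=None,
                              downsampled_sbr=False,
                              max_output_frequency=None):
    """Output sampling frequency following Notes 1 to 3 (sketch only)."""
    if sbr_present_flag == 0:   # Note 2: no SBR data
        return sampling_frequency
    if sbr_present_flag == 1:   # Note 3: explicitly signaled
        return extension_sampling_frequency
    # Note 1 (sbrPresentFlag == -1): assume SBR data is present.
    doubled = 2 * sampling_frequency
    if downsampled_sbr:
        return sampling_frequency
    if max_output_frequency is not None and doubled > max_output_frequency:
        return sampling_frequency
    return doubled


row = explicit_signaling_behavior(1, 6, 2)
rate = output_sampling_frequency(-1, 24000)
```

Here max_output_frequency stands in for the maximum output sampling frequency allowed by the current level, which the caller is assumed to know.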

Amendment Subpart 4

2  Scope

2.1  Introduction

This International Standard describes the integration of ER-BSAC and SBR, which is capable of fine-grain scalable reproduction with bandwidth extension. In the preferred modes of operation, the high-frequency sound components can be either an ER-BSAC enhancement layer signal or a synthesized SBR signal.

2.2  Technical overview

The basic structure of the integration of ER-BSAC and SBR is shown in Figure 1.

Figure 1 – Overview of the integration of ER-BSAC and SBR decoder

3  Syntax

3.1  Extension Syntax

Replace the definition of bsac_raw_data_block() in ISO/IEC 14496-3:2005 Part 3: Audio, Subpart 4, Subclause 4.4.2.6 Payloads for the audio object type ER BSAC, Table 4.33, with the table below:

Table 4.33 – Syntax of bsac_raw_data_block()

Syntax / No. of bits / Mnemonic
bsac_raw_data_block()
{
    bsac_base_element();
    layer = slayer_size;
    while ( data_available() && layer < (top_layer + slayer_size) ) {
        bsac_layer_element(layer);
        layer++;
    }
    byte_alignment();
    if ( data_available() ) {
        zero_code; / 32 / bslbf
        sync_word; / 4 / bslbf
        while ( data_available() ) {
            extension_type; / 4 / bslbf
            switch ( extension_type ) {
            case EXT_BSAC_CHANNEL :
                extended_bsac_raw_data_block();
                break;
            case EXT_BSAC_SBR_DATA :
                extended_bsac_sbr_data(nch, 0);
                break;
            case EXT_BSAC_SBR_DATA_CRC :
                extended_bsac_sbr_data(nch, 1);
                break;
            case EXT_BSAC_CHANNEL_SBR :
                extended_bsac_raw_data_block();
                extended_bsac_sbr_data(nch, 0);
                break;
            case EXT_BSAC_CHANNEL_SBR_CRC :
                extended_bsac_raw_data_block();
                extended_bsac_sbr_data(nch, 1);
                break;
            }
        }
    }
}

In Part 3: Audio, Subpart 4, under Bitstream elements in subclause 4.5.2.6.2.1 Definitions, after bsac_raw_data_block, add the following:

sync_word a four-bit code that identifies the start of the extended part. Its value is the bit string ‘1111’.

extension_type a four-bit code that identifies the extension type according to Table 1.

Table 1 – BSAC extension_type

Symbol / Value of extension_type / Purpose
EXT_BSAC_CHANNEL / ‘1111’ / BSAC channel extension
EXT_BSAC_SBR_DATA / ‘0000’ / BSAC SBR enhancement
EXT_BSAC_SBR_DATA_CRC / ‘0001’ / SBR enhancement with CRC
EXT_BSAC_CHANNEL_SBR / ‘1110’ / BSAC channel extension with SBR
EXT_BSAC_CHANNEL_SBR_CRC / ‘1101’ / BSAC channel extension with SBR_CRC
RESERVED / ‘0010’ ~ ‘1100’ / reserved
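
As an informal aid to reading the extension_type table, the Python sketch below encodes its values as an enumeration and reports which payloads follow each code. It is not normative: the class name BsacExtensionType and the function describe_extension are hypothetical, and reserved codes are simply reported as None.

```python
from enum import IntEnum


class BsacExtensionType(IntEnum):
    """Four-bit extension_type codes from the table above."""
    EXT_BSAC_SBR_DATA = 0b0000
    EXT_BSAC_SBR_DATA_CRC = 0b0001
    EXT_BSAC_CHANNEL_SBR_CRC = 0b1101
    EXT_BSAC_CHANNEL_SBR = 0b1110
    EXT_BSAC_CHANNEL = 0b1111


def describe_extension(extension_type: int):
    """Return (has_channel_extension, has_sbr, crc_flag), or None if reserved."""
    try:
        ext = BsacExtensionType(extension_type)
    except ValueError:
        return None  # codes 0b0010 to 0b1100 are reserved
    has_channel = ext in (BsacExtensionType.EXT_BSAC_CHANNEL,
                          BsacExtensionType.EXT_BSAC_CHANNEL_SBR,
                          BsacExtensionType.EXT_BSAC_CHANNEL_SBR_CRC)
    has_sbr = ext != BsacExtensionType.EXT_BSAC_CHANNEL
    crc_flag = ext in (BsacExtensionType.EXT_BSAC_SBR_DATA_CRC,
                       BsacExtensionType.EXT_BSAC_CHANNEL_SBR_CRC)
    return has_channel, has_sbr, crc_flag


info = describe_extension(0b1110)  # EXT_BSAC_CHANNEL_SBR
```

The three flags correspond to the calls made in bsac_raw_data_block(): extended_bsac_raw_data_block() when a channel extension is present, and extended_bsac_sbr_data(nch, crc_flag) when SBR data is present.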

3.2  Scalable SBR Syntax

Add the following subclause after 4.4.2.8 in Part 3: Audio, Subpart 4:

3.2.1  SBR payloads for the audio object type ER BSAC

Table 2 – Syntax of extended_bsac_sbr_data()

Syntax / No. of bits / Mnemonic
extended_bsac_sbr_data(nch, crc_flag)
{
    num_sbr_bits = 0;
    cnt = count; / 4 / uimsbf
    num_sbr_bits += 4;
    if (cnt == 15) {
        cnt += esc_count - 1; / 8 / uimsbf
        num_sbr_bits += 8;
    }
    if (crc_flag) {
        bs_sbr_crc_bits; / 10 / uimsbf
        num_sbr_bits += 10;
    }
    num_sbr_bits += 1;
    if (bs_header_flag) / 1 / uimsbf
        num_sbr_bits += sbr_header();
    num_sbr_bits += bsac_sbr_data(nch, bs_amp_res);
    num_align_bits = (8*cnt - num_sbr_bits) % 8;
    bs_fill_bits; / num_align_bits / uimsbf
    return ((num_sbr_bits + num_align_bits) / 8);
}
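
The length and alignment arithmetic in extended_bsac_sbr_data() can be illustrated with the short Python sketch below. It is a reading aid under stated assumptions, not a decoder: the function names are hypothetical, and cnt is assumed to be the payload length in bytes, which is what the final division by 8 suggests.

```python
def payload_byte_count(count: int, esc_count: int = 0) -> int:
    """Combine the 4-bit count and the optional 8-bit esc_count into cnt."""
    if not 0 <= count <= 15:
        raise ValueError("count is a 4-bit field")
    if not 0 <= esc_count <= 255:
        raise ValueError("esc_count is an 8-bit field")
    cnt = count
    if cnt == 15:
        # esc_count is only present in the bitstream when count == 15.
        cnt += esc_count - 1
    return cnt


def alignment_bits(cnt: int, num_sbr_bits: int) -> int:
    """Number of bs_fill_bits: (8*cnt - num_sbr_bits) % 8.

    Python's % always yields a value in 0..7 here; in C the result could be
    negative if num_sbr_bits exceeded 8*cnt, so a port would need a guard.
    """
    return (8 * cnt - num_sbr_bits) % 8


cnt = payload_byte_count(15, 10)   # 15 + 10 - 1 = 24
fill = alignment_bits(cnt, 187)    # (192 - 187) % 8 = 5
```

With these example numbers, 187 bits of SBR data plus 5 fill bits give 192 bits, that is, the 24 bytes announced by count and esc_count.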

Table 3 – Syntax of bsac_sbr_data()