[MS-RTPME]:

Real-Time Transport Protocol (RTP/RTCP): Microsoft Extensions

Intellectual Property Rights Notice for Open Specifications Documentation

§  Technical Documentation. Microsoft publishes Open Specifications documentation (“this documentation”) for protocols, file formats, data portability, computer languages, and standards support. Additionally, overview documents cover inter-protocol relationships and interactions.

§  Copyrights. This documentation is covered by Microsoft copyrights. Regardless of any other terms that are contained in the terms of use for the Microsoft website that hosts this documentation, you can make copies of it in order to develop implementations of the technologies that are described in this documentation and can distribute portions of it in your implementations that use these technologies or in your documentation as necessary to properly document the implementation. You can also distribute in your implementation, with or without modification, any schemas, IDLs, or code samples that are included in the documentation. This permission also applies to any documents that are referenced in the Open Specifications documentation.

§  No Trade Secrets. Microsoft does not claim any trade secret rights in this documentation.

§  Patents. Microsoft has patents that might cover your implementations of the technologies described in the Open Specifications documentation. Neither this notice nor Microsoft's delivery of this documentation grants any licenses under those patents or any other Microsoft patents. However, a given Open Specifications document might be covered by the Microsoft Open Specifications Promise or the Microsoft Community Promise. If you would prefer a written license, or if the technologies described in this documentation are not covered by the Open Specifications Promise or Community Promise, as applicable, patent licenses are available by contacting .

§  Trademarks. The names of companies and products contained in this documentation might be covered by trademarks or similar intellectual property rights. This notice does not grant any licenses under those rights. For a list of Microsoft trademarks, visit www.microsoft.com/trademarks.

§  Fictitious Names. The example companies, organizations, products, domain names, email addresses, logos, people, places, and events that are depicted in this documentation are fictitious. No association with any real company, organization, product, domain name, email address, logo, person, place, or event is intended or should be inferred.

Reservation of Rights. All other rights are reserved, and this notice does not grant any rights other than as specifically described above, whether by implication, estoppel, or otherwise.

Tools. The Open Specifications documentation does not require the use of Microsoft programming tools or programming environments in order for you to develop an implementation. If you have access to Microsoft programming tools and environments, you are free to take advantage of them. Certain Open Specifications documents are intended for use in conjunction with publicly available standards specifications and network programming art and, as such, assume that the reader either is familiar with the aforementioned material or has immediate access to it.

Revision Summary

Date / Revision History / Revision Class / Comments
4/8/2008 / 0.1 / New / Version 0.1 release
6/20/2008 / 1.0 / Major / Updated and revised the technical content.
7/25/2008 / 1.0.1 / Editorial / Changed language and formatting in the technical content.
8/29/2008 / 1.0.2 / Editorial / Changed language and formatting in the technical content.
10/24/2008 / 1.0.3 / Editorial / Changed language and formatting in the technical content.
12/5/2008 / 1.1 / Minor / Clarified the meaning of the technical content.
1/16/2009 / 1.2 / Minor / Clarified the meaning of the technical content.
2/27/2009 / 1.3 / Minor / Clarified the meaning of the technical content.
4/10/2009 / 1.3.1 / Editorial / Changed language and formatting in the technical content.
5/22/2009 / 1.3.2 / Editorial / Changed language and formatting in the technical content.
7/2/2009 / 1.3.3 / Editorial / Changed language and formatting in the technical content.
8/14/2009 / 1.3.4 / Editorial / Changed language and formatting in the technical content.
9/25/2009 / 1.4 / Minor / Clarified the meaning of the technical content.
11/6/2009 / 1.4.1 / Editorial / Changed language and formatting in the technical content.
12/18/2009 / 1.4.2 / Editorial / Changed language and formatting in the technical content.
1/29/2010 / 1.4.3 / Editorial / Changed language and formatting in the technical content.
3/12/2010 / 1.4.4 / Editorial / Changed language and formatting in the technical content.
4/23/2010 / 1.4.5 / Editorial / Changed language and formatting in the technical content.
6/4/2010 / 1.4.6 / Editorial / Changed language and formatting in the technical content.
7/16/2010 / 1.4.6 / None / No changes to the meaning, language, or formatting of the technical content.
8/27/2010 / 1.4.6 / None / No changes to the meaning, language, or formatting of the technical content.
10/8/2010 / 1.4.6 / None / No changes to the meaning, language, or formatting of the technical content.
11/19/2010 / 1.5 / Minor / Clarified the meaning of the technical content.
1/7/2011 / 1.5 / None / No changes to the meaning, language, or formatting of the technical content.
2/11/2011 / 1.5 / None / No changes to the meaning, language, or formatting of the technical content.
3/25/2011 / 1.5 / None / No changes to the meaning, language, or formatting of the technical content.
5/6/2011 / 1.5 / None / No changes to the meaning, language, or formatting of the technical content.
6/17/2011 / 1.6 / Minor / Clarified the meaning of the technical content.
9/23/2011 / 1.6 / None / No changes to the meaning, language, or formatting of the technical content.
12/16/2011 / 1.6 / None / No changes to the meaning, language, or formatting of the technical content.
3/30/2012 / 1.6 / None / No changes to the meaning, language, or formatting of the technical content.
7/12/2012 / 1.6 / None / No changes to the meaning, language, or formatting of the technical content.
10/25/2012 / 1.6 / None / No changes to the meaning, language, or formatting of the technical content.
1/31/2013 / 1.6 / None / No changes to the meaning, language, or formatting of the technical content.
8/8/2013 / 1.6 / None / No changes to the meaning, language, or formatting of the technical content.
11/14/2013 / 1.6 / None / No changes to the meaning, language, or formatting of the technical content.
2/13/2014 / 1.6 / None / No changes to the meaning, language, or formatting of the technical content.
5/15/2014 / 1.6 / None / No changes to the meaning, language, or formatting of the technical content.
6/30/2015 / 1.6 / None / No changes to the meaning, language, or formatting of the technical content.
10/16/2015 / 1.6 / None / No changes to the meaning, language, or formatting of the technical content.
7/14/2016 / 1.6 / None / No changes to the meaning, language, or formatting of the technical content.

Table of Contents

1 Introduction

1.1 Glossary

1.2 References

1.2.1 Normative References

1.2.2 Informative References

1.3 Overview

1.4 Relationship to Other Protocols

1.5 Prerequisites/Preconditions

1.6 Applicability Statement

1.7 Versioning and Capability Negotiation

1.8 Vendor-Extensible Fields

1.9 Standards Assignments

2 Messages

2.1 Transport

2.1.1 Confidentiality

2.2 Message Syntax

2.2.1 RTP Packets

2.2.2 RTCP Compound Packets

2.2.3 RTCP Probe Packet

2.2.4 RTCP Packet Pair

2.2.5 RTCP Sender Report (SR)

2.2.6 RTCP SDES

2.2.7 RTCP Profile-Specific Extension

2.2.7.1 RTCP Profile-Specific Extension for Estimated Bandwidth

3 Protocol Details

3.1 RTP Details

3.1.1 Abstract Data Model

3.1.2 Timers

3.1.3 Initialization

3.1.4 Higher-Layer Triggered Events

3.1.5 Message Processing Events and Sequencing Rules

3.1.6 Timer Events

3.1.7 Other Local Events

3.2 RTCP Details

3.2.1 Abstract Data Model

3.2.2 Timers

3.2.3 Initialization

3.2.4 Higher-Layer Triggered Events

3.2.5 Message Processing Events and Sequencing Rules

3.2.6 Timer Events

3.2.7 Other Local Events

4 Protocol Examples

4.1 SSRC Change Throttling

4.2 Bandwidth Estimation

4.3 Key Derivation

5 Security

5.1 Security Considerations for Implementers

5.2 Index of Security Parameters

6 Appendix A: Product Behavior

7 Change Tracking

8 Index

1  Introduction

This document specifies the Real-Time Transport Protocol (RTP/RTCP) Microsoft Extensions (RTPME), a set of extensions to the base Real-Time Transport Protocol (RTP) specified in [RFC3550]. RTP is a set of network transport functions suitable for applications transmitting real-time data, such as audio and video, across multimedia endpoints.

Sections 1.5, 1.8, 1.9, 2, and 3 of this specification are normative. All other sections and examples in this specification are informative.

1.1  Glossary

This document uses the following terms:

base64 encoding: A binary-to-text encoding scheme whereby an arbitrary sequence of bytes is converted to a sequence of printable ASCII characters, as described in [RFC4648].
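
As an informal illustration of this definition, the following Python sketch converts arbitrary bytes to printable ASCII and back by using the standard library. The helper names to_base64_text and from_base64_text are hypothetical and are not defined by this specification.

```python
import base64


def to_base64_text(data: bytes) -> str:
    """Encode arbitrary bytes as printable ASCII text ([RFC4648] alphabet)."""
    return base64.b64encode(data).decode("ascii")


def from_base64_text(text: str) -> bytes:
    """Decode base64 text back to the original bytes.

    validate=True rejects characters outside the base64 alphabet
    instead of silently discarding them.
    """
    return base64.b64decode(text, validate=True)


if __name__ == "__main__":
    raw = bytes([0x00, 0xFF, 0x10, 0x80])  # non-printable input bytes
    text = to_base64_text(raw)
    print(text)
    print(from_base64_text(text) == raw)
```

Every 3 input bytes map to 4 output characters, and the output is padded with "=" when the input length is not a multiple of 3.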

cipher block chaining (CBC): A method of encrypting multiple blocks of plaintext with a block cipher such that each ciphertext block is dependent on all previously processed plaintext blocks. In the CBC mode of operation, the first block of plaintext is XOR'd with an Initialization Vector (IV). Each subsequent block of plaintext is XOR'd with the previously generated ciphertext block before encryption with the underlying block cipher. To prevent certain attacks, the IV must be unpredictable, and no IV should be used more than once with the same key. CBC is specified in [SP800-38A] section 6.2.
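
The chaining described in this definition can be sketched in Python as follows. This is an illustration of the mode of operation only: toy_encrypt_block and toy_decrypt_block are hypothetical placeholders that stand in for a real block cipher and provide no security, and the 8-byte block size is an assumption chosen to match the DES block size.

```python
BLOCK_SIZE = 8  # bytes; assumption, matches the 64-bit DES block size


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    # Placeholder for a real block cipher. NOT secure; illustration only.
    return bytes((b + k) % 256 for b, k in zip(block, key))


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    # Inverse of toy_encrypt_block.
    return bytes((b - k) % 256 for b, k in zip(block, key))


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    if len(key) != BLOCK_SIZE or len(iv) != BLOCK_SIZE:
        raise ValueError("key and IV must be exactly one block long")
    if len(plaintext) % BLOCK_SIZE != 0:
        raise ValueError("plaintext must be a whole number of blocks")
    previous = iv  # the first block is XOR'd with the IV
    out = []
    for i in range(0, len(plaintext), BLOCK_SIZE):
        block = plaintext[i:i + BLOCK_SIZE]
        # Each block is XOR'd with the previous ciphertext block
        # before it is encrypted.
        previous = toy_encrypt_block(key, xor_bytes(block, previous))
        out.append(previous)
    return b"".join(out)


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    if len(key) != BLOCK_SIZE or len(iv) != BLOCK_SIZE:
        raise ValueError("key and IV must be exactly one block long")
    if len(ciphertext) % BLOCK_SIZE != 0:
        raise ValueError("ciphertext must be a whole number of blocks")
    previous = iv
    out = []
    for i in range(0, len(ciphertext), BLOCK_SIZE):
        block = ciphertext[i:i + BLOCK_SIZE]
        out.append(xor_bytes(toy_decrypt_block(key, block), previous))
        previous = block
    return b"".join(out)
```

Because each ciphertext block feeds into the next one, two identical plaintext blocks generally produce different ciphertext blocks. A real implementation would substitute an actual block cipher and an unpredictable IV that is never reused with the same key.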

codec: An algorithm that is used to convert media between digital formats, especially between raw media data and a format that is more suitable for a specific purpose. Encoding converts the raw data to a digital format. Decoding reverses the process.

conference: A Real-Time Transport Protocol (RTP) session that includes more than one participant.

connectionless protocol: A transport protocol that enables endpoints to communicate without a previous connection arrangement and that treats each packet independently as a datagram. Examples of connectionless protocols are Internet Protocol (IP) and User Datagram Protocol (UDP).

connection-oriented transport protocol: A transport protocol that enables endpoints to communicate after first establishing a connection and that treats each packet according to the connection state. An example of a connection-oriented transport protocol is Transmission Control Protocol (TCP).

contributing source (CSRC): A source of a stream of RTP packets that has contributed to the combined stream produced by an RTP mixer. The mixer inserts a list of the synchronization source (SSRC) identifiers of the sources that contributed to the generation of a particular packet into the RTP header of that packet. This list is called the CSRC list. An example application is audio conferencing where a mixer indicates all the talkers whose speech was combined to produce the outgoing packet, allowing the receiver to indicate the current talker, even though all the audio packets contain the same SSRC identifier (that of the mixer). See [RFC3550] section 3.
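
As an informal illustration, the following Python sketch reads the SSRC and the CSRC list from an RTP packet. It assumes the base fixed header layout of [RFC3550] section 5.1 (a 4-bit CSRC count in the first byte, the SSRC at byte offset 8, and the CSRC list starting at byte offset 12) and does not account for any changes made by these extensions. The function name parse_ssrc_and_csrcs is hypothetical.

```python
import struct

RTP_FIXED_HEADER_SIZE = 12  # bytes, per the base RTP header layout


def parse_ssrc_and_csrcs(packet: bytes):
    """Return (ssrc, [csrc, ...]) read from the start of an RTP packet."""
    if len(packet) < RTP_FIXED_HEADER_SIZE:
        raise ValueError("packet shorter than the RTP fixed header")
    first_byte = packet[0]
    if first_byte >> 6 != 2:
        raise ValueError("unexpected RTP version")
    csrc_count = first_byte & 0x0F  # CC field: number of CSRC identifiers
    header_end = RTP_FIXED_HEADER_SIZE + 4 * csrc_count
    if len(packet) < header_end:
        raise ValueError("packet truncated inside the CSRC list")
    (ssrc,) = struct.unpack_from("!I", packet, 8)
    csrcs = list(struct.unpack_from("!%dI" % csrc_count, packet,
                                    RTP_FIXED_HEADER_SIZE))
    return ssrc, csrcs


if __name__ == "__main__":
    # A mixer (SSRC 0x11111111) combining two talkers.
    header = struct.pack("!BBHII", 0x80 | 2, 0, 1, 160, 0x11111111)
    csrc_list = struct.pack("!II", 0xAAAA0001, 0xAAAA0002)
    ssrc, csrcs = parse_ssrc_and_csrcs(header + csrc_list + b"payload")
    print(hex(ssrc), [hex(c) for c in csrcs])
```

In the audio conferencing example, a receiver could use the returned CSRC values to indicate the current talkers even though every packet carries the SSRC of the mixer.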

Data Encryption Standard (DES): A specification for encryption of computer data that uses a 56-bit key. DES was developed by IBM and adopted by the U.S. government as a standard in 1976. For more information, see [FIPS46-3].

datagram: A style of communication offered by a network transport protocol where each message is contained within a single network packet. In this style, there is no requirement for establishing a session prior to communication, as opposed to a connection-oriented style.

dual-tone multi-frequency (DTMF): In telephony systems, a signaling system in which each digit is associated with two specific frequencies. This system typically is associated with touch-tone keypads for telephones.

encryption: In cryptography, the process of obscuring information to make it unreadable without special knowledge.

forward error correction (FEC): A process in which a sender uses redundancy to enable a receiver to recover from packet loss.

jitter: A variation in a network delay that is perceived by the receiver of each packet.
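
As an informal illustration, the following Python sketch applies the interarrival jitter estimator of the base protocol ([RFC3550] section 6.4.1). It is not a definition introduced by these extensions. All times are assumed to be expressed in RTP timestamp units, and the function name update_jitter is hypothetical.

```python
def update_jitter(jitter, prev_send, prev_recv, send, recv):
    """Return the updated jitter estimate after one more packet arrives.

    prev_send / send: RTP timestamps of the previous and current packet.
    prev_recv / recv: arrival times of the same packets, same units.
    """
    # Difference between the receiver-side spacing and the sender-side spacing.
    d = (recv - prev_recv) - (send - prev_send)
    # Move the running estimate 1/16 of the way toward |d|.
    return jitter + (abs(d) - jitter) / 16.0


if __name__ == "__main__":
    # Packets are sent 160 units apart; the third one arrives 10 units late.
    jitter = 0.0
    jitter = update_jitter(jitter, 0, 0, 160, 160)
    jitter = update_jitter(jitter, 160, 160, 320, 330)
    print(jitter)
```

A packet stream that arrives with exactly the spacing at which it was sent leaves the estimate unchanged at zero. Only variation in delay, not the delay itself, contributes to the result.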

MD5: A one-way, 128-bit hashing scheme that was developed by RSA Data Security, Inc., as described in [RFC1321].

message digest algorithm 5 (MD5): A cryptographic hash function that generates 128 bits of hash value as specified in [RFC1321].

multimedia session: A set of concurrent RTP sessions among a common group of participants. For example, a video conference (which is a multimedia session) can contain an audio RTP session and a video RTP session. See [RFC3550] section 3.

non-RTP means: Protocols and mechanisms that might be needed in addition to RTP to provide a usable service. In particular, for multimedia conferences, a control protocol can distribute multicast addresses and keys for encryption, negotiate the encryption algorithm to be used, and define dynamic mappings between RTP payload type values and the payload formats they represent for formats that do not have a predefined payload type value. Examples of such protocols include the Session Initiation Protocol (SIP) ([RFC3261]), ITU Recommendation H.323, and applications using SDP ([RFC2327]), such as RTSP ([RFC2326]). For simple applications, electronic mail or a conference database can also be used. See [RFC3550] section 3.

packetization time (P-time): The amount, in milliseconds, of audio data that is sent in a single Real-Time Transport Protocol (RTP) packet.
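
As a worked example of this definition, the following Python sketch relates the P-time to the number of audio samples carried in each packet and to the packet rate. The 8000 Hz clock rate and 20 ms P-time in the usage lines are example values only, and the function names are hypothetical.

```python
def samples_per_packet(clock_rate_hz: int, ptime_ms: int) -> int:
    """Number of audio samples carried in one RTP packet."""
    return clock_rate_hz * ptime_ms // 1000


def packets_per_second(ptime_ms: int) -> float:
    """Number of RTP packets sent per second of audio."""
    return 1000 / ptime_ms


if __name__ == "__main__":
    # 20 ms of audio sampled at 8000 Hz.
    print(samples_per_packet(8000, 20))  # 160
    print(packets_per_second(20))        # 50.0
```

A shorter P-time lowers the delay added by packetization but increases the packet rate, and therefore the per-packet header overhead.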

participant: A user who is participating in a conference or peer-to-peer call, or the object that is used to represent that user.

port: The abstraction that transport protocols use to distinguish among multiple destinations within a given host computer. TCP/IP protocols identify ports by using small positive integers. The transport selectors (TSEL) used by the OSI transport layer are equivalent to ports. RTP depends upon the lower-layer protocol to provide some mechanism such as ports to multiplex the RTP and RTCP packets of a session. For more information, see [RFC3550] section 3.

Real-Time Transport Protocol (RTP): A network transport protocol that provides end-to-end transport functions that are suitable for applications that transmit real-time data, such as audio and video, as described in [RFC3550].

RTCP packet: A control packet consisting of a fixed header part similar to that of RTP packets, followed by structured elements that vary depending upon the RTCP packet type. Typically, multiple RTCP packets are sent together as a compound RTCP packet in a single packet of the underlying protocol; this is enabled by the length field in the fixed header of each RTCP packet. See [RFC3550] section 3.
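
As an informal illustration of how the length field enables compound packets, the following Python sketch splits one datagram into its individual RTCP packets. It assumes the base RTCP header layout of [RFC3550] section 6.4.1, in which the 16-bit length field holds the packet size in 32-bit words minus one, and it ignores encryption and any handling specific to these extensions. The function name split_compound_rtcp is hypothetical.

```python
import struct


def split_compound_rtcp(datagram: bytes):
    """Return a list of (packet_type, packet_bytes) from a compound packet."""
    packets = []
    offset = 0
    while offset < len(datagram):
        if len(datagram) - offset < 4:
            raise ValueError("truncated RTCP header")
        packet_type = datagram[offset + 1]
        (length_words,) = struct.unpack_from("!H", datagram, offset + 2)
        # The length field counts 32-bit words minus one, header included.
        size = (length_words + 1) * 4
        if offset + size > len(datagram):
            raise ValueError("RTCP packet extends past the datagram")
        packets.append((packet_type, datagram[offset:offset + size]))
        offset += size
    return packets


if __name__ == "__main__":
    # A 28-byte sender report (type 200) followed by an 8-byte SDES (type 202).
    sr = struct.pack("!BBH", 0x80, 200, 6) + bytes(24)
    sdes = struct.pack("!BBH", 0x81, 202, 1) + bytes(4)
    for packet_type, body in split_compound_rtcp(sr + sdes):
        print(packet_type, len(body))
```

Because each packet states its own length, a receiver can step over packet types that it does not recognize and continue with the next packet in the datagram.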