[MS-MSSOD]:

Media Streaming Server Protocols Overview

Intellectual Property Rights Notice for Open Specifications Documentation

Technical Documentation. Microsoft publishes Open Specifications documentation (“this documentation”) for protocols, file formats, data portability, computer languages, and standards support. Additionally, overview documents cover inter-protocol relationships and interactions.

Copyrights. This documentation is covered by Microsoft copyrights. Regardless of any other terms that are contained in the terms of use for the Microsoft website that hosts this documentation, you can make copies of it in order to develop implementations of the technologies that are described in this documentation and can distribute portions of it in your implementations that use these technologies or in your documentation as necessary to properly document the implementation. You can also distribute in your implementation, with or without modification, any schemas, IDLs, or code samples that are included in the documentation. This permission also applies to any documents that are referenced in the Open Specifications documentation.

No Trade Secrets. Microsoft does not claim any trade secret rights in this documentation.

Patents. Microsoft has patents that might cover your implementations of the technologies described in the Open Specifications documentation. Neither this notice nor Microsoft's delivery of this documentation grants any licenses under those patents or any other Microsoft patents. However, a given Open Specifications document might be covered by the Microsoft Open Specifications Promise or the Microsoft Community Promise. If you would prefer a written license, or if the technologies described in this documentation are not covered by the Open Specifications Promise or Community Promise, as applicable, patent licenses are available by contacting .

License Programs. To see all of the protocols in scope under a specific license program and the associated patents, visit the Patent Map.

Trademarks. The names of companies and products contained in this documentation might be covered by trademarks or similar intellectual property rights. This notice does not grant any licenses under those rights. For a list of Microsoft trademarks, visit

Fictitious Names. The example companies, organizations, products, domain names, email addresses, logos, people, places, and events that are depicted in this documentation are fictitious. No association with any real company, organization, product, domain name, email address, logo, person, place, or event is intended or should be inferred.

Reservation of Rights. All other rights are reserved, and this notice does not grant any rights other than as specifically described above, whether by implication, estoppel, or otherwise.

Tools. The Open Specifications documentation does not require the use of Microsoft programming tools or programming environments in order for you to develop an implementation. If you have access to Microsoft programming tools and environments, you are free to take advantage of them. Certain Open Specifications documents are intended for use in conjunction with publicly available standards specifications and network programming art and, as such, assume that the reader either is familiar with the aforementioned material or has immediate access to it.

Support. For questions and support, please contact .

Revision Summary

Date / Revision History / Revision Class / Comments
3/30/2012 / 1.0 / New / Released new document.
7/12/2012 / 1.0 / None / No changes to the meaning, language, or formatting of the technical content.
10/25/2012 / 1.0 / None / No changes to the meaning, language, or formatting of the technical content.
1/31/2013 / 1.0 / None / No changes to the meaning, language, or formatting of the technical content.
8/8/2013 / 2.0 / Major / Updated and revised the technical content.
11/14/2013 / 2.0 / None / No changes to the meaning, language, or formatting of the technical content.
2/13/2014 / 2.0 / None / No changes to the meaning, language, or formatting of the technical content.
5/15/2014 / 2.0 / None / No changes to the meaning, language, or formatting of the technical content.
6/30/2015 / 3.0 / Major / Significantly changed the technical content.
10/16/2015 / 3.0 / None / No changes to the meaning, language, or formatting of the technical content.
9/26/2016 / 4.0 / Major / Significantly changed the technical content.
6/1/2017 / 4.0 / None / No changes to the meaning, language, or formatting of the technical content.

Table of Contents

1 Introduction

1.1 Conceptual Overview

1.2 Glossary

1.3 References

2 Functional Architecture

2.1 Overview

2.1.1 System Purpose

2.1.2 System Components

2.1.3 Applicability

2.1.4 Relevant Standards

2.1.5 Protocol Relationship

2.1.5.1 RTSP-WME: Logical Dependencies and Relationship to Other Protocols

2.1.5.2 MMSP: Logical Dependencies and Relationship to Other Protocols

2.1.5.3 MSB: Relationship to Other Protocols

2.1.5.4 MSBD: Logical Dependencies and Relationship to Other Protocols

2.1.5.5 WMSP: Logical Dependencies and Relationship to Other Protocols

2.1.5.6 WMHTTP: Logical Dependencies and Relationship to Other Protocols

2.2 Protocol Summary

2.3 Environment

2.3.1 Authentication

2.3.2 Media Player Client

2.3.3 Encoder

2.3.4 Dependencies on This System

2.3.5 Dependencies on Other Systems/Components

2.4 Assumptions and Preconditions

2.5 Use Cases

2.5.1 Publish Content to Media Server - Encoder

2.5.2 Publish Secure Content to Media Server - Encoder

2.5.3 Stream Content from Media Server - Media Player Client

2.5.4 Request License from License Server - Media Player Client

2.5.5 Log Statistics to Servers - Media Player Client

2.5.6 Discover Media Server URLs - Media Player Client

2.6 Versioning, Capability Negotiation, and Extensibility

2.7 Error Handling

2.8 Coherency Requirements

2.9 Security

2.10 Additional Considerations

3 Examples

3.1 Example 1: Encoder Push Content to Media Server

3.2 Example 2: Media Server Pull Content from Encoder

3.3 Example 3: Stream Content from Media Server to Media Player Client

3.3.1 Stream Content Using RTSP-WME

3.3.2 Stream Content Using WMSP

3.3.3 Estimation of Packet-Pair Bandwidth

3.4 Example 4: Publish Secure Content to Media Server

3.5 Example 5: Log Statistics to Server

4 Microsoft Implementations

4.1 Product Behavior

5 Change Tracking

6 Index

1 Introduction

The Media Streaming Server system is a platform for streaming audio and video content to clients over the Internet or an intranet. These clients can be other computers or devices that play back the content by using a media player, or they can be other computers running media servers that proxy, cache, or redistribute content.

The Media Streaming Server (MSS) system is designed to deliver an end-to-end experience for components that are involved in the creation, distribution, and playback of audio and video content. The system enables administrators and content providers to create media solutions for corporate communications, training and education, e-commerce, commercial broadcast, and other uses. The Media Streaming Server system consists of a computer running a media encoder, a server running as a media server, and a number of client computers running media player clients. The encoder converts both live and prerecorded audio and video content to a media format. The server then distributes the content over a network or the Internet. The media player client then receives the content. To scale and meet network demands, the system can also include cache and proxy servers, and distribution servers.

In e-commerce scenarios, the Media Streaming Server system can require the support of Digital Rights Management (DRM) components to enable the administrator to securely encrypt the broadcast and download of content.

Each of the components in the system uses the member protocols to enable scenarios that range from live broadcast playback to on-demand playback.

1.1 Conceptual Overview

The Media Streaming Server (MSS) system includes protocols to transmit data packets that originate from downloadable and streaming audio, video, and other multimedia data files.

Concepts that are specific to the Media Streaming Server system are:

Digital Rights Management: Digital Rights Management (DRM) provides content providers with the means to protect their proprietary music or other data from illegal uses, such as the creation of unauthorized copies. DRM technology protects digital content by encrypting it and attaching to it usage rules that determine the conditions under which a user can play back the content. Usage rules typically limit the number of computers or devices that have access to the content, or limit the number of times that content can be played.

Encoder: An encoder is a tool that is used to capture audio and video files and streams, to digitize them, and to provide them to media servers for distribution. For more information on creating a broadcast, see [WM9CSEB] section 5.

The following figure illustrates the live broadcast configuration.

Figure 1: Live broadcast configuration

Encoders typically capture the video and audio streams from capture cards and recording devices. To capture from an analog source, such as a video tape, the computer requires a capture card that recognizes the analog stream and converts it to digital media information. The encoder then converts the digital media information to encoded media that can be efficiently transported as streaming media.

Live broadcast: A live broadcast is often used when viewers want to see and hear an important event as it is occurring. For information about on-demand versus live broadcasts, see [MSFT-WMSDG], "Distributing content".

Media player client: A media player client is usually the destination point in the Media Streaming Server system, and is typically designed for rendering the media streams.

On-demand broadcast: An on-demand broadcast is a re-broadcast of a live event or of any media file that is not time critical. In this case, users can request the stream when they want to watch it and can control the playback to meet their requirements. For information about on-demand versus live broadcasts, see [MSFT-WMSDG], Windows Media Services 2008 Deployment Guide.

Playlist: A playlist is either a file or collection of content files that is designed to play in a specific order, or a query that results in a list of content files that are designed to play in a specific order.

The following table describes various playlists that are used by the MSS system.

Playlist formats / Details
Media playlist files / Designed for audio-only files and often referred to as the MP3 playlist. The playlist can provide URLs to HTTP servers or media servers.
Advanced Stream Redirector files / Advanced Stream Redirector files are based on the Extensible Markup Language (XML) syntax and are designed specifically to provide URLs to content from media servers to the client. For more information on Advanced Stream Redirector files, see [MSDN-ASX].
Windows Media Player playlist / A playlist query that is based on Synchronized Multimedia Integration Language (SMIL) that only works on local content and is not used by MSS. Windows Media Player playlist syntax is based on SMIL 2.0. Clients load the playlists and process them locally. The playlist provides URLs or paths to the files. The client then streams the individual content by using the MSS system protocols. See the W3C website [W3C] for the SMIL 2.0 specification. For more information on the Windows Media Player playlist syntax, see [MSDN-MediaPlaylists].
Server-side playlists / Server-side playlists are a query method that is used to generate a list of content to be streamed to the client. The server processes the playlist locally and then streams the content to the client by using the MSS system protocols. The client receives a new Advanced Systems Format (ASF) file header each time the server transitions from one entry to the next in the server-side playlist. Server-side playlists are beneficial because they allow the server (or encoder) operator to inject new playlist entries, such as advertisements, into a live program.
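
As an illustration of the Advanced Stream Redirector row in the table above, the following Python sketch builds a small XML playlist that points a client at content on a media server. It is a minimal sketch, not a normative example: the element and attribute names (asx, entry, title, ref, href, version) follow the commonly documented ASX structure and should be checked against [MSDN-ASX], and the function name, titles, and URLs are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_asx(entries):
    """Return an ASX-style XML playlist as a string.

    `entries` is a list of (title, url) tuples. Element and attribute
    names are assumptions based on the commonly documented ASX layout;
    see [MSDN-ASX] for the authoritative syntax.
    """
    root = ET.Element("asx", {"version": "3.0"})
    for title, url in entries:
        entry = ET.SubElement(root, "entry")
        ET.SubElement(entry, "title").text = title
        # Each entry carries a reference to content on a media server.
        ET.SubElement(entry, "ref", {"href": url})
    return ET.tostring(root, encoding="unicode")


# Hypothetical titles and URLs, used only for illustration.
playlist = build_asx([
    ("Opening remarks", "mms://media.example.com/opening.wmv"),
    ("Keynote", "rtsp://media.example.com/keynote.wmv"),
])
print(playlist)
```

The client would load a file like this, read each URL in order, and then stream the individual entries by using the MSS system protocols.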

Origin server: An origin server is a media server that publishes on-demand or live content.

Distribution server: A distribution server improves the scalability of the Media Streaming Server system. It has to be networked to the origin server and have permission to stream from the origin server.

A distribution server publishes content that it received from another streaming source, such as another media server. The origin server is the source of the content that is being streamed by the distribution server. Clients then connect to the distribution server as if it were the origin server. Distribution servers are located between the origin server and the client in the content stream and therefore can perform load balancing. Distribution servers provide an easy way to reduce the client load on a media server because the client content requests are distributed to several servers on the network. Publishing via distribution servers is illustrated in the following figure.

Figure 2: Publishing via distribution servers

Proxy server: A proxy server is a dedicated computer that proxies data between the media player client and the server. If the proxy server is acting as a caching server, it requests a stream from the origin server once and allows multiple clients to stream the cached content, so the origin server handles only one network request. Broadcast content cannot be cached. In this case, the proxy server can create a split stream for the content: the proxy server that receives the stream from the origin server splits it and distributes it to multiple clients simultaneously without increasing the number of requests to the origin server. Proxy servers fall into three categories:

Forward proxy server: The forward proxy server can retrieve information from another server on behalf of a client. Typically, a client is explicitly configured to use a specific proxy server, and when the client requests content, the proxy server connects to an origin server to retrieve the content.

Reverse proxy server: A reverse proxy server is a proxy server that is configured to service all client requests. For unicast broadcasts, a reverse proxy server can reduce the load on the origin server by streaming multiple unicast streams while receiving only one stream from the origin server. For on-demand content, a reverse proxy server can reduce the load on the origin server by caching the content from the origin server and streaming it to clients from its cache.

To the client, the reverse proxy server appears to be the origin server. This structure enables the origin server to be isolated from the clients. A reverse proxy server can increase the security of the streaming media system because the client never connects to the origin server directly.

Transparent proxy server: A transparent proxy server is a server that transmits data between the server and the client without any modification of the data. It is a forwarding service that the client is unaware of.
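
The caching behavior described above for on-demand content can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a description of any Microsoft implementation: the class name, the `fetch_from_origin` callback, and the in-memory dictionary cache are all hypothetical, and a real proxy would stream data incrementally and handle cache expiry and authorization.

```python
class CachingProxy:
    """Toy model of a proxy that caches on-demand content (hypothetical)."""

    def __init__(self, fetch_from_origin):
        # fetch_from_origin is a stand-in for a request to the origin server.
        self._fetch_from_origin = fetch_from_origin
        self._cache = {}
        self.origin_requests = 0

    def get(self, content_name):
        """Return content, contacting the origin only on a cache miss."""
        if content_name not in self._cache:
            self.origin_requests += 1
            self._cache[content_name] = self._fetch_from_origin(content_name)
        return self._cache[content_name]


# Three clients request the same content; the origin is contacted once.
proxy = CachingProxy(lambda name: f"<media bytes for {name}>")
for _client in range(3):
    proxy.get("training-video")
print(proxy.origin_requests)  # prints 1
```

The counter makes the point of the text concrete: however many clients request the cached content, the origin server sees a single request.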

Packet-pair bandwidth estimation: Packet-pair bandwidth estimation is a technique that is used to estimate the bandwidth of a streaming media connection over the Internet.

To estimate bandwidth, the server sends two or more consecutive packets of highly entropic data. The client estimates the bandwidth by measuring the difference between the times at which it receives the packets. This method is usually reliable; however, if the client traverses a Network Address Translation (NAT) firewall or proxy server, the packet-pair bandwidth measurement might be inaccurate. Packet-pair bandwidth estimation is supported by the following protocols: the Real-Time Streaming Protocol (RTSP) Windows Media Extensions (RTSP-WME), the Microsoft Media Server (MMS) Protocol (MMSP), and the Windows Media HTTP Streaming Protocol (WMSP), as illustrated in the following figure.

Figure 3: Packet-pair bandwidth estimation
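
The client-side calculation can be sketched as follows. This is a minimal sketch that assumes the conventional packet-pair model, in which back-to-back packets of equal size are spread apart by the bottleneck link, so that bandwidth is approximately the packet size divided by the arrival gap. The function name, the use of the mean gap, and the sample values are assumptions for illustration; the member protocol specifications define the actual packet sizes and procedure.

```python
def estimate_bandwidth_bps(packet_size_bytes, arrival_times_s):
    """Estimate bandwidth in bits per second from packet arrival times.

    Assumes all packets have the same size and were sent back to back.
    `arrival_times_s` holds the receive timestamps, in seconds, in order.
    """
    if len(arrival_times_s) < 2:
        raise ValueError("need at least two packet arrival times")
    gaps = [
        later - earlier
        for earlier, later in zip(arrival_times_s, arrival_times_s[1:])
    ]
    if min(gaps) <= 0:
        raise ValueError("arrival times must be strictly increasing")
    mean_gap_s = sum(gaps) / len(gaps)
    # One packet's worth of bits arrived during each gap.
    return packet_size_bytes * 8 / mean_gap_s


# Three 1500-byte packets arriving 62.5 ms apart (hypothetical values).
print(estimate_bandwidth_bps(1500, [0.0, 0.0625, 0.125]))  # prints 192000.0
```

Anything that delays or re-spaces the packets in transit, such as the NAT firewall or proxy server mentioned above, changes the measured gap and therefore skews the estimate.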

Fast start: Fast start allows the media player to buffer at speeds higher than the bit rate of the requested content. This enables users to start receiving content more quickly. After the initial buffer requirement is fulfilled, on-demand and broadcast content stream at the bit rate defined by the content stream.

Fast start also allows a distribution server to request the data from the origin server at a faster bit rate. The bit rate that is specified in the fast start protocol headers ensures that the distribution server has enough data buffered to meet its requirements and the requirements of the media player client. The protocols use the following two headers, or tokens, to request fast start:

Accelerate headers: The tokens that the client uses to request a higher transmission rate and duration from the server.

Burst headers: The tokens that the client uses to request a higher transmission rate and duration from the server. The client that sends the request is usually an intermediate device that is relaying the request for another client.

Fast start is supported only by the Windows Media HTTP Streaming Protocol (WMSP) and the Real-Time Streaming Protocol (RTSP) Windows Media Extensions (RTSP-WME).

Advanced fast start: Advanced fast start is designed to minimize startup latency in the media player client. Startup latency is the period of time starting when a viewer requests a stream by using the player and ending when the content begins playing. The primary reason for startup latency is the delay caused by buffering on the media player client.

Advanced fast start enables the media player client to begin playing a stream before its buffer is full. As soon as the media player client receives a minimum amount of data, it can begin playing a stream while its buffer continues to fill at an accelerated rate—a rate that is faster than the encoded bit rate of the content. When the buffer is full, acceleration stops, and the media player client begins receiving data at the encoded bit rate.

For advanced fast start to work effectively, adequate bandwidth has to be available above the encoded bit rate of a stream. For example, if 1,200 kilobits per second (Kbps) of bandwidth is available for an 800 Kbps stream, the media player client can use an acceleration rate of 1.5 times the encoded bit rate. If no additional bandwidth is available, the player fills its buffer before it begins playing a stream, and no benefit can be gained from either advanced fast start or fast start.
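
The arithmetic in the preceding example can be written out as a short Python sketch. The function names and the 6-second content figure are hypothetical, and the delivery-time calculation is a simplification that ignores data consumed by playback while the buffer fills.

```python
def acceleration_factor(available_kbps, encoded_kbps):
    """Return the rate multiplier available above the encoded bit rate.

    A result of 1.0 means there is no spare bandwidth, so neither fast
    start nor advanced fast start provides a benefit.
    """
    if available_kbps <= 0 or encoded_kbps <= 0:
        raise ValueError("bit rates must be positive")
    return max(1.0, available_kbps / encoded_kbps)


def delivery_seconds(content_seconds, available_kbps, encoded_kbps):
    """Time needed to deliver `content_seconds` of content when accelerated."""
    return content_seconds / acceleration_factor(available_kbps, encoded_kbps)


print(acceleration_factor(1200, 800))   # prints 1.5
print(acceleration_factor(800, 800))    # prints 1.0
print(delivery_seconds(6, 1200, 800))   # prints 4.0
```

With 1,200 Kbps available for an 800 Kbps stream, 6 seconds of content arrives in 4 seconds; with only 800 Kbps available, the factor falls to 1.0 and delivery takes the full 6 seconds.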

Advanced fast start is used only by clients that connect to a unicast stream and is supported only by the Windows Media HTTP Streaming Protocol (WMSP) and the Real-Time Streaming Protocol (RTSP) Windows Media Extensions (RTSP-WME).

Unicast streaming: Unicast streaming is a one-to-one connection between the media server and a media player client, which means that each client receives a distinct stream. Only those clients that request the stream receive it. The server can deliver content as a unicast stream from either an on-demand or a broadcast publishing point. Unicast streaming offers the benefits of interactivity between the player and server. However, the number of users that can receive unicast streams is limited by the bit rate of the content, the speed of the server network, and the available server resources. The number of users that are served is directly proportional to the amount of available server resources and instances. Unicast streaming is illustrated in the following figure.