QoS Requirements of Multimedia Applications
Brett Berliner, Brian Clark and Albert Hartono
Department of Computer Science and Engineering
The Ohio State University
Columbus, OH 43210
{berliner, clarkbr, hartonoa}@cse.ohio-state.edu
Abstract
Quality of Service (QoS) requirements are very important to multimedia applications. Ensuring that these requirements are met is key to many of today’s applications, and creating new technologies that can meet stricter requirements will help enable new devices in the future. This paper is a study of the values of the QoS requirements for multimedia applications.
Introduction
Figure 2: Delay Contribution by Components [2]
Delay Component / Maximum Delay Contribution (ms) / Comments
Packetization / 30 / The process of converting the actual digital signal into packets.
Serialization / 40 / This delay is only incurred when using modems.
Dejittering / 30 / Done to compensate for jitter introduced by networks. This assumes a 1x dejitter buffer is used.
Presentation / 17 / The delay introduced by actually presenting the information to the human recipient.
Figure 1: Audio Compression Standards (Codecs) [1]
Codec Name / Sampling Rate (kHz) / Bit Rate (Kbps) / Delay Contribution (ms) / Miscellaneous
G.711 / 8 / 64 / < 1 / -
GIPS Enhanced G.711 / 8 / Variable / - / Voice activity detection
G.723.1 / 8 / 5.3 and 6.3 / 100 / -
G.728 / 8 / 16 / 2 / Optimized for low delay
G.729 / 8 / 8 / 10 / Voice activity detection
When the internet was designed, it was intended to be used for the transfer of text or other simple data types, where the level of service did not matter. The only thing that mattered was reliability. Concepts such as delay, jitter and packet loss percentages did not affect the service; the only thing that mattered was that the service existed. When the internet started being used for applications such as internet telephony, streaming live video and even remote surgery, things like jitter and packet loss began to matter. Multimedia, unlike text, needs service guarantees, or else the services become useless. Trying to carry on a conversation when words arrive out of order would be quite frustrating. Thus, the introduction of multimedia into the internet led to the concept of quality of service.
1. Voice
With the growth of the internet and easier access to high speed internet connections, more and more people are turning towards computer networks to handle their long distance voice communication instead of the traditional telephone system. Using the internet to replace standard telephone lines has many advantages. One of the biggest is that using the internet for voice communication eliminates the concept of “long distance”. Most companies that provide internet based voice communication charge a monthly rate and do not charge on a per-minute basis like traditional telephone companies. The most common way of transmitting voice over the internet is Voice over Internet Protocol, or VoIP.
1.1 Raw Data for VoIP
The process of sending human voice over a computer network starts with a person speaking into a PC microphone. The sound waves produced by the voice must be translated into an electrical signal in order to be sent over a network. This process of converting the analog signal to a digital one is called digitization. In order to digitize human voice effectively, a sample is captured 8,000 times per second, which corresponds to a sampling rate of 8 kHz. It is standard to use 8 bits per sample, which results in an uncompressed data rate of 64 Kbps. At the application layer this digital signal is encoded and decoded by a codec. The sampling rates, bit rates and extra information about popular codecs can be found in Figure 1. Like any other calculation, encoding and decoding voice signals takes some finite amount of time, and each codec introduces a different amount of delay. The delay contributions of the various codecs are also presented in Figure 1. Beyond the codec itself, delay comes from various other sources. The upper bounds of some of the factors contributing to this delay are presented in Figure 2 [3].
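The 64 Kbps figure follows directly from the sampling parameters. The sketch below reproduces that arithmetic and, as an illustration, compares the uncompressed rate against the bit rates listed in Figure 1. The function and dictionary names are hypothetical, and 1 Kbps is assumed to mean 1,000 bits per second.

```python
# Hypothetical helper: uncompressed PCM bit rate from sampling parameters.
# Assumes 1 Kbps = 1000 bit/s.
def pcm_bit_rate_kbps(sampling_rate_hz: int, bits_per_sample: int) -> float:
    return sampling_rate_hz * bits_per_sample / 1000


# Fixed bit rates (Kbps) taken from Figure 1; G.723.1 is shown at its lower rate.
FIGURE_1_BIT_RATES_KBPS = {
    "G.711": 64,
    "G.723.1": 5.3,
    "G.728": 16,
    "G.729": 8,
}

raw_kbps = pcm_bit_rate_kbps(8000, 8)
print(f"Uncompressed rate: {raw_kbps} Kbps")

for codec, rate in FIGURE_1_BIT_RATES_KBPS.items():
    # Ratio of the uncompressed rate to the codec's output rate.
    print(f"{codec}: {raw_kbps / rate:.1f}:1")
```

Under these assumptions G.711 comes out at 1:1, which is consistent with it carrying the 8 kHz, 8-bit signal at the full 64 Kbps.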
Once the voice is encoded with a particular codec, it is transmitted over the internet using the Internet Protocol (IP). Since IP is a best-effort service, the QoS is not perfect and some delay, loss and jitter are encountered. When the encoded signal reaches its intended destination, it is decoded using the same codec that the sender used. Finally, the decoded audio signal is presented to the receiver through a speaker.
1.2 Need for Delay Reduction
From Figures 1 and 2, one can see that the delay introduced from a codec alone can approach almost 120 ms. This number does not include the various other delays introduced by the network, such as propagation delay, queuing delay and transmission delay. Based on ITU recommendation G.114, the delay in a telephone call should be less than 100-150 ms. The reasoning behind this is psychological: if the delay is much greater than this, the caller will be dissatisfied with the service. Even though a delay of 100-150 ms is acceptable, most QoS requirements for VoIP ask for a delay of 50-80 ms or less.
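One way to read the "almost 120 ms" figure is as the sum of the upper bounds listed in Figure 2, which comes to 117 ms. That reading is an assumption, not something the text states. The sketch below adds up those components and shows how much of a given delay target would remain for the network; all names are hypothetical.

```python
# Upper-bound delay contributions (ms) as listed in Figure 2.
FIGURE_2_MAX_DELAY_MS = {
    "packetization": 30,
    "serialization": 40,  # only incurred when using modems
    "dejittering": 30,    # assumes a 1x dejitter buffer
    "presentation": 17,
}


def end_system_delay_ms(components=FIGURE_2_MAX_DELAY_MS) -> int:
    """Sum of the per-component upper bounds."""
    return sum(components.values())


def remaining_network_budget_ms(target_ms: int, components=FIGURE_2_MAX_DELAY_MS) -> int:
    """Delay left for propagation, queuing and transmission; negative if over budget."""
    return target_ms - end_system_delay_ms(components)


print(end_system_delay_ms())             # 117
print(remaining_network_budget_ms(150))  # 33
print(remaining_network_budget_ms(80))   # -37
```

With every component at its upper bound, a 150 ms target leaves 33 ms for the network, while an 80 ms target is already exceeded. This is one illustration of why the stricter VoIP requirements call for delay reduction.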
1.3 Solutions
Figure 3: Required Compression Ratios for Package Television [4]
Channel / Bit Rate / NTSC TV (168 Mb/s) / HDTV (933 Mb/s) / Film Quality (2300 Mb/s)
PC local LAN / 30 kb/s / 5,600:1 / 31,000:1 / 76,000:1
Modems / 56 kb/s / 3,000:1 / 17,000:1 / 41,000:1
ISDN / 64 – 144 kb/s / 1,166:1 / 6,400:1 / 16,000:1
T-1, DSL / 1.5 Mb/s / 112:1 / 622:1 / 1,500:1
Ethernet / 10 Mb/s / 17:1 / 93:1 / 230:1
T-3 / 42 Mb/s / 4:1 / 22:1 / 54:1
Fiber Optic / 200 Mb/s / 1:1 / 5:1 / 11:1
Since delay must be minimized to ensure satisfactory telephone service, we must employ some techniques to reduce it. The first major way to speed up voice communication is to compress the audio signal. If the sheer size of the data being transported is reduced, it will arrive at the destination more quickly. Some notable low bit rate compression algorithms in use are ITU G.723.1 and G.729A. Another way to reduce the payload of transmitting voice over IP is to use silence suppression. Because during normal telephone conversation one person talks while the other listens, only 50% of the full duplex connection is used at a time. Also, voice packets need not be transmitted during the silence between words. By not sending packets containing “dead air”, bandwidth use is reduced by approximately a further 10%. Together, these two effects give silence suppression a total bandwidth reduction of up to 60%.
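As a rough illustration of the percentages above, the sketch below applies the 50% and 10% savings additively, as the text does, to a nominal codec bit rate. The function name is hypothetical, and real savings depend on the conversation and the voice activity detector.

```python
# Hypothetical helper: average bit rate after silence suppression,
# treating the two savings quoted in the text as additive percentages.
def rate_after_silence_suppression_kbps(
    nominal_kbps: float,
    listening_saving_pct: int = 50,  # one party listens while the other talks
    pause_saving_pct: int = 10,      # silence between words
) -> float:
    remaining_pct = 100 - listening_saving_pct - pause_saving_pct
    return nominal_kbps * remaining_pct / 100


print(rate_after_silence_suppression_kbps(64))  # G.711 at 64 Kbps
print(rate_after_silence_suppression_kbps(8))   # G.729 at 8 Kbps
```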
2. Video
Video traffic is being sent more and more often over today’s internet, and its volume will only increase in the future. Applications such as video conferencing are becoming business standards, and many websites, CNN.com for example, offer videos on demand. Today many homes even have digital cable television service, which transmits video information over a network. Since video imaging requires large amounts of data, compression and reservation protocols are going to become necessary to support the future of video in networking.
2.1 Raw Data
In order to achieve a studio quality picture, a video stream is broken up into 30 frames per second. Each of these frames contains 525 lines. In each of these frames the y value, or luminance, is sampled at 13.5 MHz and the two chrominance values, u and v, are each sampled at 6.75 MHz. At 8 bits per sample, the total data rate comes out to (13.5 + 6.75 + 6.75) * 8 = 216 Mbps. Because of this extremely high bit rate, compression techniques are required to transmit video over the internet. Different transmission lines require different amounts of compression. If a channel supports higher bandwidth, then less compression is needed. Conversely, if a channel has lower bandwidth, then a higher compression ratio is necessary to view the data. This point is illustrated in Figure 3. It can be seen that for slower channels, sending video, even of lower quality, is just not feasible because of the enormous compression ratios needed.
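The raw data rate and the ratios in Figure 3 can be reproduced with a few lines of arithmetic. The sketch below is a minimal illustration with hypothetical function names. It assumes the required compression ratio is simply the source bit rate divided by the channel bit rate, which matches the Figure 3 entries after rounding. Note that Figure 3 lists 168 Mb/s for NTSC TV rather than the 216 Mbps computed in the text.

```python
# Hypothetical helpers for the arithmetic in this section.
def raw_video_rate_mbps(y_mhz=13.5, u_mhz=6.75, v_mhz=6.75, bits_per_sample=8):
    """Uncompressed rate: total samples per second times bits per sample."""
    return (y_mhz + u_mhz + v_mhz) * bits_per_sample


def required_compression_ratio(source_bps: float, channel_bps: float) -> float:
    """How many times the source must shrink to fit the channel."""
    return source_bps / channel_bps


print(raw_video_rate_mbps())  # 216.0

NTSC_BPS = 168e6  # NTSC TV bit rate as listed in Figure 3
CHANNELS_BPS = {"Modems": 56e3, "T-1, DSL": 1.5e6, "Ethernet": 10e6, "T-3": 42e6}

for name, bps in CHANNELS_BPS.items():
    print(f"{name}: {required_compression_ratio(NTSC_BPS, bps):.1f}:1")
```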
Video uses techniques to compress individual frames, like JPEG does, but also uses motion prediction to compress the data further. In fact, most of the time the bit rate required for video transmission depends solely on motion within the images. Factors such as screen size, resolution and scanning rates are almost irrelevant. Motion is defined in increments of 1k (1024) pixels/second. In normal television this translates to approximately one square inch of changed image per second. This change does not need to be in one contiguous block; it can be scattered throughout the entire image. Figure 4 illustrates this by showing the number of simultaneous channels various types of links can support for different rates of motion.
Figure 4: Number of Television Channels for Various Averaged Motions Within the Images [4]
Channel / Link Bit Rate / Very Slow (2 kp/s, 12 kb/s) / Slow (4 kp/s, 24 kb/s) / Normal (8 kp/s, 48 kb/s) / Fast (16 kp/s, 96 kb/s)
PC local LAN / 30 kb/s / 2.5 / 1 / 0 / 0
Modems / 56 kb/s / 4 / 2 / 1 / 0
ISDN / 64 – 144 kb/s / 12 / 6 / 3 / 1
T-1, DSL / 1.5 Mb/s / 125 / 62 / 31 / 15
Ethernet / 10 Mb/s / 833 / 416 / 208 / 104
T-3 / 42 Mb/s / 3500 / 1750 / 875 / 437
Fiber Optic / 200 Mb/s / 16,666 / 8,333 / 4,166 / 2,083
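The channel counts in Figure 4 appear to be the link bit rate divided by the per-channel bit rate, rounded down. That is an inference from the numbers, not something the source states. The sketch below uses a hypothetical function name to reproduce most entries under that assumption. One exception is the 30 kb/s link at very slow motion, which the figure lists as 2.5 where rounding down would give 2.

```python
# Hypothetical helper: whole channels that fit on a link, assuming each
# channel needs a fixed bit rate and capacity is shared without overhead.
def channels_supported(link_kbps: float, per_channel_kbps: float) -> int:
    return int(link_kbps // per_channel_kbps)


# Per-channel bit rates (kb/s) for each motion class, from Figure 4.
MOTION_KBPS = {"very slow": 12, "slow": 24, "normal": 48, "fast": 96}

# Link rates (kb/s) from Figure 4; ISDN is taken at its upper rate of 144 kb/s.
LINKS_KBPS = {"Modems": 56, "ISDN": 144, "T-1, DSL": 1500, "Ethernet": 10_000}

for link, link_kbps in LINKS_KBPS.items():
    counts = [channels_supported(link_kbps, r) for r in MOTION_KBPS.values()]
    print(link, counts)
```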
2.2 Delay Introduced By Compression
Every computation takes some time and compressing/decompressing video is no exception. More often than not, the latency introduced by this process is much greater than the latency introduced by digitization and digital processing in uncompressed format. Since most video is very data intensive a high compression ratio is needed. The greater the compression ratio used, the greater the latency introduced. Typically the delay introduced by encoding and decoding in a distribution and/or broadcast scenario is several seconds [5].
3. Interactive Gaming
Recently, interactive multimedia, such as network gaming, remote visualization, remote surgery and tele-immersion, has become a very large part of the still developing internet. These types of applications often have QoS requirements that are even tougher to satisfy than those of video and voice. This is because these applications generally cannot afford to lose packets or suffer from any noticeable latency; otherwise there is a good chance the experience will be degraded, if not ruined.
Among researchers, there is a belief that the lower bound on the acceptable delay for interactive multimedia is 15 ms, which is the amount of time it takes for a 66 Hz monitor to draw a single frame. With a lower delay requirement, the monitor could not keep up, and therefore most of these methods could not be implemented [6]. Figure 5 shows the average delay requirements for interactive multimedia in comparison to those of video and voice.
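The 15 ms bound can be checked from the refresh rate, since the time to draw one frame is the reciprocal of the refresh frequency. The helper below is a hypothetical name used only for illustration.

```python
# Hypothetical helper: time to draw one frame at a given refresh rate.
def frame_time_ms(refresh_hz: float) -> float:
    return 1000.0 / refresh_hz


print(f"{frame_time_ms(66):.2f} ms")  # about 15.15 ms for a 66 Hz monitor
```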
3.1 Definition
One type of interactive multimedia is interactive gaming. Interactive gaming, in this case, refers to players on their own machines connecting remotely to other machines to compete against each other in the same event. The device used to connect could be a PC, a console game system or a handheld device. Each of the devices already has most of the game data, such as the engine and the graphics, so only certain data needs to be sent to the central server. This data may include each character’s position and orientation, as well as its current action. The central server then sends the pertinent data to the connected computers for processing.
Figure 5: Delay Requirements for Data Types [6]
Application / Video / Voice / Interactive Multimedia
Delay (ms) / 150 / 150+80 / 15
3.2 QoS Requirements
The QoS requirements of interactive gaming help ensure that gameplay is a smooth, realistic experience for all users who have the minimum internet connection a given game needs. Failing to meet even one of these requirements will often completely ruin gamers’ experiences while playing. The QoS requirements that most directly affect interactive gaming are [7]:
1. a minimum amount of throughput
2. an acceptable end-to-end delay
3. low jitter
4. low packet loss rate
5. high dependability
3.3 Throughput
Throughput is a QoS requirement that varies from game to game. Most games require only a 56K dial-up connection (about 40 Kbps) to run smoothly. Two of today’s most popular online games, Guild Wars and Counter-Strike, can both be played online with a 56K connection; Counter-Strike needs only around 16 Kbps per connected user to avoid slowdown [7]. This number can vary greatly depending on the genre of game. Games where players have to take turns, such as Massively Multiplayer Online Role Playing Games (MMORPGs) like Everquest or World of Warcraft, can tolerate slower links, as the data can update while the player is waiting for their turn. As a result, these games often require only a 30-40 Kbps link. This also applies to real-time strategy games such as Command and Conquer, where the player tells their units what to do and, while the units are processing, the server can send and receive data. These games generally hover around 20-30 Kbps, although the newer the game, the higher the link speed necessary. Very new first person shooters (FPS), such as Battlefield 1942, can be played with 16 players on a 40 Kbps connection. However, to take full advantage of all of the vehicles and weapons, as well as to allow all 64 possible players at a time, each user must have a broadband connection of around 250 Kbps [8]. Its sequel, Battlefield 2, needs around that level and offers no guarantees for those with less speed. In fact, for highest performance with 64 players, a link of 2 Mbps is necessary. The following table shows roughly how much speed each type of game needs.