EE 5359

MULTIMEDIA PROCESSING

INTERIM REPORT

SPRING 2016

STUDY AND PERFORMANCE ANALYSIS OF HEVC, H.264/AVC AND DIRAC

BY

ASHRITA MANDALAPU

1001096980

With Guidance by,

Dr.K.R.RAO

ACRONYMS

AVC: Advanced Video Coding.

BBC: British Broadcasting Corporation

BD-BR: Bjontegaard Delta Bit rate.

BD-PSNR: Bjontegaard Delta Peak Signal to Noise Ratio.

CIF: Common Intermediate Format

CTB: Coding Tree Block.

CTU: Coding Tree Unit.

CU: Coding Unit.

EBU: European Broadcasting Unit

fps: Frames per second

HD: High Definition

HDTV: High Definition Television

HEVC: High Efficiency Video Coding.

HM: HEVC Test Model.

ICME: International Conference on Multimedia and Expo

IEC: International Electro-technical Commission.

ISDB: Integrated Services Digital Broadcasting

ISO: International Organization for Standardization.

ITU-T: International Telecommunication Union- Telecommunication Standardization Sector.

JCT: Joint Collaborative Team.

JCT-VC: Joint Collaborative Team on Video Coding.

JM: H.264 Test Model.

JPEG: Joint Photographic Experts Group.

MC: Motion Compensation.

ME: Motion Estimation.

MPEG: Moving Picture Experts Group.

MSE: Mean Square Error.

MSU: Moscow State University

PB: Prediction Block.

PSNR: Peak Signal to Noise Ratio.

QCIF: Quarter Common Intermediate Format

QF: Quality Factor

QP: Quantization Parameter

RTP: Real-time Transport Protocol

SSIM: Structural Similarity Index.

TB: Transform Block.

TU: Transform Unit.

VCEG: Visual Coding Experts Group.

VQMT: Video Quality Measurement Tool

OBJECTIVE

The objective of this project is to study, implement and compare the video coding standards of HEVC [1],H.264/AVC [2] and Dirac [3]. The analysis is performed in terms of complexity, video quality, bit rates, compression ratio using different performance metrics like MSE, PSNR, BD-BR, BD-PSNR and SSIM [13] [14]. The advantages and disadvantages for the baseline profiles of the above mentioned video coding standards are studied for understanding of the video coding standards.

INTRODUCTION

ColorSpaces

Thecommoncolor spacesfor digital imageandvideorepresentationare:

  • RGB Color space – Each pixel is represented by three numbers indicating the relative proportionsof red, green andbluecolor.
  • Y Color space – Y is the luminance component, a monochrome version of color image. Y is a weighted average of R, G and B: YkrRkgGkbB,wherekvalues aretheweightingfactors.Thecolorinformationisrepresentedascolordifferencesorchrominancecomponents,where each chrominancecomponent isdifferencebetweenR, Gor BandtheluminanceY.

As the human visual system is less sensitive to color than the luminance component, Yhasadvantages over RGB space. The amount of data required to represent the chrominance component reduces without impairing the visual quality [22].

Thepopularpatternsofsub-sampling[22] are:

  • 4:4:4 – The three components, Y:: have the same resolution, which is for every 4 luminance samples there are 4 and 4 samples.
  • 4:2:2 – For every 4 luminance samples in the horizontal direction, there are 2 and 2 samples. This representation is used for high quality video color reproduction.
  • 4:2:0 – and each have half the horizontal and vertical resolution of Y. This is popularly used in applications such as video conferencing, digital television and DVD storage.

Figure1. Sub-sampling pattern [22]

Figure2. 4:2:2 and 4:4:4 sub-sampling patterns [22]

CIF and QCIF Formats:

Common Intermediate Format (CIF) and Quadrature Common Intermediate Format (QCIF) determine the resolution of the frame. The resolution of CIF is 352x288 and the resolution of QCIF is 1/4 of CIF, which is 176x144 [7].

Consider the Y family of color spaces where Y represents the luminance, represents the blue-difference chroma component and represents the red-difference chroma component [7].

For QCIF and CIF, the luminance Y is equal to the resolution. If sampling resolution 4:2:0 is used, for CIF, the and are 176 x 144 lines and for QCIF, the and are 88 x 72 lines.

HEVC

High Efficiency Video Coding (HEVC) [1] is an international standard for video compression developed by a working group of ISO/IEC MPEG (Moving Picture Experts Group) and ITU-T VCEG (Video Coding Experts Group). HEVC is a successor to the H.264/AVC video coding standard [1]. One of its primary objectives is to provide approximately two times the compression efficiency of its predecessor without any detectable loss in visual quality [1]. HEVC thus achieves 2x higher compression compared to H.264/AVC. HEVC provides high throughput (Ultra-HD 8K @ 120fps) andlow power. HEVC features friendly implementations such as built-in parallelism.

HEVC will provide a flexible, reliable and robust solution to support the next decade of video. HEVC benefits include

  • Reduce the burden on global networks
  • Easier streaming of HD video to mobile devices
  • Account for advancing screen resolutions (e.g. Ultra-HD)

HEVC is designed to address existing applications of H.264/MPEG-4 AVC and to focus on two key issues: increased video resolution and increased use of parallel processing architectures [6]. It primarily targets consumer applications as pixel formats are limited to 4:2:0 8-bit and 4:2:0 10-bit. The new revised standard enables new use-cases with the support of additional pixel formats such as 4:2:2 and 4:4:4, embedded bit-stream scalability and 3D video [4].

Figure3. Block Diagram of HEVC [17]

Encoder and Decoder:

Source video, consisting of a sequence of video frames, is encoded or compressed by a video encoder shown in figure 4 to create a compressed video bit stream. The compressed bit stream is stored or transmitted. A video decoder as shown in figure 5 decompresses the bit stream to create a sequence of decoded frames [17].

The video encoder performs the following steps:

  • Partitioning each picture into multiple units
  • Predicting each unit using inter or intra prediction, and subtracting the prediction from the unit
  • Transforming and quantizing the residual (the difference between the original picture unit and the prediction)
  • Entropy encoding transform output, prediction information, mode information and headers

The video decoder performs the following steps:

  • Entropy decoding and extracting the elements of the coded sequence
  • Rescaling and inverting the transform stage
  • Predicting each unit and adding the prediction to the output of the inverse transform
  • Reconstructing a decoded video image

Figure4. Block diagram of HEVC encoder with built-in decoder (gray shaded region) [5].

H.264/AVC

H.264/AVC is an efficient video compression technique available today, developed as a result of the collaboration between the ISO/IEC Moving Picture Experts Group and the ITU-T Video Coding Experts Group.

It is the most widely used video coding standard [18] for streaming videos, mobile/handheld applications, HDTV broadcasting etc. The H.264 standard supports three sampling patterns for luminance component (Y), red-difference chroma component () and blue-difference chroma component () [20]. The 4:4:4 sampling means that the three components (Y: ) have the same resolution and a sample of each component exists at every pixel.

Encoder and Decoder:

An H.264 encoder converts the raw video into a compressed version and the decoder converts the compressed video back to its original format.The H.264 encoder block diagram is shown in Figure 5.

Figure5. Block diagram of AVC encoder [2]

The encoder performs transform, quantization, prediction and encoding to produce compressed video. The decoder shown in Figure 6 on the other hand does the inverse operations to obtain the uncompressed video.

Figure6. Block diagram of AVC decoder [7]

DIRAC

Dirac is a video compression system [9] developed by the British Broadcasting Corporation (BBC) utilizing motion compensation and wavelet transformsto provide high-quality video compression for web streaming and HDTV applications [3].Dirac can compress any size of picture from low resolution QCIF(180 x 144) to HDTV(1920 x 1080).

There are two parts in the Dirac development process: (i) a compression specification for the bit stream and decoder, and (ii) software for compression and decompression. The software is not intended simply to provide reference coding and decoding - It is a prototype implementation that can freely be modified and deployed. The decoder in particular is designed to be fast and more agile. The resulting specification is simple and straightforward to implement and optimize for real-time performance [9].

Encoder and Decoder:

In the Dirac codec, image motion is tracked and the motion information is used to make a prediction of a later frame. A wavelet transform is applied to the prediction error between the current frame and the previous frame aided by motion compensation and the transform coefficients are quantized and entropy coded.Temporal and spatial redundancies are removed by motion estimation, motion compensation and discrete wavelet transform respectively. Dirac uses a more flexible and efficient form of entropy coding called arithmetic coding which packs the bits efficiently into the bit stream [11]. The encoder has the architecture as shown in Figure 7, while the decoder performs the inverse operations as shown in Figure 8.

Dirac can be fully implemented in C++ programming language which allows object oriented development on all common operating systems. The C++ code compiles to produce libraries for common functions, motion estimation, encoding and decoding, which have an interface that allows them to be called from C. An application programmer’s interface can be written in C so that it can be kept simple and integrated with various media players, video processing tools and streaming software. [2]

Figure7. Block diagram of Dirac encoder [9]

Figure8. Block diagram of Dirac decoder [9]

COMPARISON METRICS

Structural Similarity Index Metric (SSIM) [14]: This index is a method for measuring the similarity between two frames. It is a full reference metric, or in other words, the measuring of image quality is done using an initial uncompressed or distortion-free frame as reference.

Mean Squared Error (MSE) [14]: The MSE is computed by averaging the squared intensity differences of the distorted and reference image/frame pixels. Two distorted images with the same MSE may have very different types of errors, some of which are much more visible than others.Given a noise-free m x n monochrome image I and its noisy approximation K, MSE is defined as:

Peak Signal-to-Noise Ratio (PSNR) [14]: The PSNR is most commonly used as a measure of quality of reconstruction of compression codecs. The signal in this case is the original data, and the noise is the error introduced by compression. PSNR is defined as:

where is the maximum possible pixel value of the image.

For test sequences in 4:2:0 color format, PSNR is computed as a weighted average of luminance (PSNR-Y) and chrominance (PSNR-U, PSNR-V) components [23] as given below:

Bjontegaard Delta Bitrate (BD-BR) and Bjontegaard Delta PSNR (BD-PSNR):To objectively evaluate the coding efficiency of video codecs, Bjontegaard Delta PSNR (BD-PSNR) wasproposed. Based on the rate-distortion (R-D) curve fitting, BD-PSNR provides a good evaluation of the R-D performance [24] [25].

BD metrics allow to compute the average gain in PSNR or the average per cent saving in bitrate between two rate-distortion curves. However, BD-PSNR has a critical drawback: It does not take the coding complexity into account [24].

PERFORMANCE ANALYSIS

Performance analysis in this project can be done using the following:

Profiles used: HM 16.8 [15], JM 19.0 [16] and Dirac 0.2.0 [12]

Test Sequences: CIF and QCIF formats [19]

Quality Metrics: MSE, PSNR, SSIM, BD-BR and BD-PSNR [13] [14]

Measurement tool: MSU Video Quality Measurement Tool (VQMT) [21]

HM 16.8

Test Sequence 1: container_cif.yuv

Width: 352; Height: 288

Original frame

Compressed frames:

QP=0

QP=30

QP=50

QP / Bitrate (kbps) / Y-PSNR (dB) / U-PSNR (dB) / V-PSNR (dB) / YUV-PSNR (dB) / Y-MSE / Y-SSIM / CR
0 / 10122.3240 / 60.0767 / 93.0513 / 65.4934 / 63.3563 / 0.02388 / 0.99982 / 3:1
10 / 2173.6872 / 48.2140 / 50.9818 / 51.6709 / 48.6494 / 1.08898 / 0.99190 / 3:1
20 / 457.4112 / 42.2188 / 47.2074 / 47.4331 / 43.3155 / 3.94241 / 0.97444 / 3:1
30 / 116.9040 / 36.2546 / 42.5361 / 42.5301 / 37.5203 / 15.44748 / 0.92480 / 3:1
40 / 32.9400 / 30.2406 / 38.8680 / 38.7084 / 31.7064 / 61.54316 / 0.84493 / 3:1
50 / 9.3672 / 24.4345 / 35.4228 / 34.7057 / 26.0082 / 234.35745 / 0.71897 / 3:1

Table1. Implementation results table of container_cif.yuv sequence

Test Sequence 2: container_qcif.yuv

Width: 176; Height: 144

Original frame

Compressed frames:

QP=0

QP=30

QP=50

QP / Bitrate (kbps) / Y-PSNR (dB) / U-PSNR (dB) / V-PSNR (dB) / YUV-PSNR (dB) / Y-MSE / Y-SSIM / CR
0 / 1871.6040 / 112.9769 / 118.9392 / 119.0284 / 62.0103 / 0.03071 / 0.99978 / 3:1
10 / 354.2736 / 49.1715 / 51.9854 / 52.4211 / 49.8180 / 0.82036 / 0.99440 / 3:1
20 / 99.9720 / 43.0272 / 46.9816 / 47.1812 / 43.9883 / 3.25569 / 0.98036 / 3:1
30 / 31.8768 / 36.4655 / 41.5515 / 41.0865 / 37.5537 / 14.72613 / 0.94231 / 3:1
40 / 12.2376 / 29.6376 / 37.8841 / 37.5096 / 31.0603 / 70.86666 / 0.88148 / 3:1
50 / 4.6392 / 23.4375 / 33.6967 / 32.8722 / 24.9746 / 294.86441 / 0.71000 / 3:1

Table2. Implementation results table of container_qcif.yuv sequence

REFERENCES

[1] G.J. Sullivan et al, “Overview of the high efficiency video coding (HEVC) standard”, IEEE Trans. CSVT, vol. 22, pp.1649-1668, Dec. 2012.

[2] T. Wiegand, G. Sullivan, G. Bjontegaard and A. Luthra, “Overview of the H.264/AVC video coding standard,”IEEE Trans. on Circuits and Systems for Video Technology, vol. 13, pp.560-576, July 2003.

[3] “The Dirac web page”:

[4] G.J. Sullivan et al, “Standardized Extensions of High Efficiency Video Coding (HEVC)”, IEEE Journal of selected topics in Signal Processing, Vol. 7, No. 6, pp. 1001-1016, Dec. 2013.

[5] V. Sze, M. Budagavi and G.J. Sullivan (Editors), “High efficiency video coding: Algorithms and architectures”, Springer 2014.

[6] S.Wenger et al,"RFC 3984: RTP Payload Format for H.264 Video". p.2.

[7] S.K.Kwon, A. Tamhankar and K.R.Rao, “Overview of H.264/MPEG-4 Part 10” J.VCIR, Vol. 17, pp. 186-216, April 2006, Special Issue on “Emerging H.264/AVC video coding standard”.

[8] K. R. Rao and D. N. Kim, “Current Video Coding Standards: H.264/AVC, Dirac, AVS China and VC-1,” IEEE 42nd Southeastern symposium onsystem theory (SSST), March 7-9 2010, pp.1-8, March 2010.

[9] T. Borer, and T. Davies, “Dirac video compression using open technology”, BBC EBUTechnical Review, July 2005

[10] A. Ravi, and K.R. Rao, “Performance Analysis and Comparison of the Dirac video codec with H.264/MPEG-4 part 10 AVC”, International Journal of Wavelets, Multiresolution and Information Processing (accepted), January 2010. Available:

[11] BBC Research on Dirac:

[12] Dirac software:

[13] Z. Wang and A.C. Bovik, “A universal image quality index”, IEEE Signal Processing

Letters, Vol.9, pp. 81-84, March 2002.

[14] Z. Wang, et al, “Image Quality Assessment: From Error Visibility to Structural Similarity”, IEEE Transactions on Image Processing, vol.13, no.4, pp. 600-612, April 2004.

[15] HEVC software:

[16] AVC software:

[17] “Video coding for low bit rate communications,” ITUT, ITU-T Recommendation H.263, ver. 1, 1995.

[18] T. Wiegand and G.J. Sullivan, “The picturephone is here. Really,” IEEE Spectrum, vol. 48, pp. 50-54, Sept. 2011.

[19] YUV video sequences:

[20] I.E.G Richardson, “The H.264 advanced video coding standard”, Second Edition, Wiley, 2010

[21] MSU tool:

[22] I.E.G. Richardson, “Video Codec Design: Developing Image and Video Compression Systems”, Wiley, 2002.

[23] J. Vanne et al, “Comparative Rate-Distortion-Complexity Analysis of HEVC and AVC Video Codecs”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 22, No. 12, pp. 1885-1898, Dec. 2012.

[24] X. Li et al, “Rate-complexity-distortion evaluation for hybrid video coding”, IEEE International Conference on Multimedia and Expo (ICME), pp. 685-690, July 2010.

[25] G. Bjontegaard, “Calculation of Average PSNR Differences between RD Curves”, document VCEGM33, ITU-T SG 16/Q 6, Austin, TX, Apr. 2001.

[26] HEVC Software Reference Manual:

[27] Tutorial: D. Grois, et al, “HEVC/H.265 Video Coding Standard (Version 2) including the Range Extensions, Scalable Extensions, and Multiview Extensions,” (Tutorial) Sunday 27 Sept 2015, 9:00 am to 12:30 pm), IEEE ICIP, Quebec City, Canada, 27 – 30 Sept. 2015.

The tutorial below is for personal use only.

Password: a2FazmgNK

[28] M. Wien, “HEVC – coding tools and specifications”, Tutorial, IEEE ICME, San Jose, CA, July 2013