IEOR 170: Interaction and Experience Design for Engineers
Lecture- 04.16.2003
Text, Image, and Sound Compression
Speaker: Anthony Levandowski
TA for IEOR 170
Email:
Introduction to Formats
-Information can be obtained in many forms: text/data, images, sound, movies, smell, taste, force, texture, etc.
-Files can be stored either in binary (1, 0) or in plain-text formats. Proprietary types, on the other hand, are a mixture of plain text and binary.
Examples of ‘proprietary types’
-MS Word – 75% text, 25% binary
-PDF – 99% binary (file modification is not allowed)
-Windows Media Player
Bytes vs. bits
- 1 byte = 8 bits
- Bit is a 0 or 1 (off or on voltage)
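A minimal Python sketch of the byte/bit relationship above. The sample string and variable names are illustrative assumptions, not part of the lecture.

```python
# Each character of plain ASCII text occupies one byte, i.e. eight bits.
text = "Ask"
data = text.encode("ascii")

# Show every byte as its eight 0/1 digits.
bits = [format(byte, "08b") for byte in data]

print(len(data), "bytes =", len(data) * 8, "bits")
print(bits)
```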
3 steps in file compression (how we store/transmit files):
- Analyze files
- Look for redundancy or pattern
the more repetitive the pattern the better
the larger the pattern the better
- Create (LZ) dictionary and replace
use the pattern found in 2 as a dictionary
substitute the pattern with dictionary index
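The three steps above can be sketched in Python roughly as follows. This is a simplified whole-word substitution, not a real LZ implementation; the function names, the lower-casing of the input, and the assumption that the text contains no digits are all choices made for this illustration.

```python
import re
from collections import Counter

def build_dictionary(text):
    """Steps 1-2: analyze the text and keep every word that repeats."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    dictionary = []
    for word in words:  # keep first-appearance order
        if counts[word] > 1 and word not in dictionary:
            dictionary.append(word)
    return dictionary

def encode(text, dictionary):
    """Step 3: replace each dictionary word with its 1-based index."""
    index = {word: str(i + 1) for i, word in enumerate(dictionary)}
    return re.sub(r"[a-z]+", lambda m: index.get(m.group(0), m.group(0)), text.lower())

def decode(encoded, dictionary):
    """Reverse step 3: replace each index with its dictionary word."""
    return re.sub(r"\d+", lambda m: dictionary[int(m.group(0)) - 1], encoded)

sentence = "Ask not what your country can do for you -- ask what you can do for your country."
dictionary = build_dictionary(sentence)
encoded = encode(sentence, dictionary)
print(dictionary)
print(encoded)
print(decode(encoded, dictionary) == sentence.lower())
```

Because the sketch lower-cases its input, the capital "A" is not preserved; a real compressor must reproduce the input exactly.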
Example of file compression
1. Analyze files
"Ask not what your country can do for you -- ask what you can do for your country."
–17 words
–61 letters
–16 spaces
–1 dash
–1 period
2. Look for redundancy or pattern
"ask" appears two times
"what" appears two times
"your" appears two times
"country" appears two times
"can" appears two times
"do" appears two times
"for" appears two times
"you" appears two times
3. Create LZ dictionary and replacement
1.ask
2.what
3.your
4.country
5.can
6.do
7.for
8.you
"1 not 2 3 4 5 6 7 8 -- 1 2 8 5 6 7 3 4."
"Ask not what your country can do for you -- ask what you can do for your country."
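Example using patterns:
‘ou’ appears in “your” and “country”
"can do for" is also repeated
"your country" can be stored whole, or split into "you" and "r country" so that "you" is shared with other phrases
Trade-off: dictionary length vs. document length. Example dictionary built from such patterns, going beyond “ou” (_ marks a space):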
- ask_
- what_
- you
- r_country
- _can_do_for_you
Thus, the compressed text is now smaller.
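The pattern dictionary above can be checked with a short Python sketch. Numbering the five entries 1 to 5 in the order listed, reading _ as a space, lower-casing the sentence, and substituting the longest pattern first are all assumptions made for this illustration.

```python
# Pattern dictionary from the notes, with "_" written as a real space.
PATTERNS = ["ask ", "what ", "you", "r country", " can do for you"]

def encode(text, patterns):
    """Replace each pattern with its 1-based index, longest pattern first."""
    order = sorted(range(len(patterns)), key=lambda i: len(patterns[i]), reverse=True)
    for i in order:
        text = text.replace(patterns[i], str(i + 1))
    return text

def decode(encoded, patterns):
    """Expand every digit back into its pattern (assumes fewer than 10 patterns)."""
    return "".join(patterns[int(ch) - 1] if ch.isdigit() else ch for ch in encoded)

sentence = "ask not what your country can do for you -- ask what you can do for your country."
encoded = encode(sentence, PATTERNS)
dictionary_size = sum(len(p) for p in PATTERNS)

print(encoded)  # 1not 2345 -- 12354.
print("original:", len(sentence), "characters")
print("encoded + dictionary:", len(encoded) + dictionary_size, "characters")
```

The dictionary has to be stored with the encoded text, which is the trade-off noted above: a longer dictionary only pays off when its entries are reused often enough.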
Question: How well does compression work on plain text files?
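Answer: 8KB → 4KB (a 50% reduction); typically a 10-20x reduction on large files
Log files: 100x or more. Why? Log lines repeat the same timestamps, field names, and messages, so they are highly redundant.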
Question: What about on PDF? How well does compression work on PDF files?
Answer: 16KB → 12.1KB; typically a 2-5x reduction on large files.
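The difference between these cases can be reproduced with Python's standard zlib module (DEFLATE, one LZ-family scheme; not necessarily the tool used in the lecture). The fake log lines are invented for this sketch, and random bytes stand in, as an assumption, for content that has little redundancy left, such as the already-compressed streams many PDF files contain.

```python
import os
import zlib

def ratio(data: bytes) -> float:
    """Original size divided by compressed size (higher means better compression)."""
    return len(data) / len(zlib.compress(data, 9))

# Highly repetitive, log-like text: the same line shape over and over.
log = "".join(
    f"2003-04-16 12:00:{i % 60:02d} INFO GET /index.html 200\n" for i in range(5000)
).encode("ascii")

# Random bytes have no pattern for the dictionary to exploit.
noise = os.urandom(len(log))

print(f"log-like text: {ratio(log):.1f}x")
print(f"random bytes:  {ratio(noise):.2f}x")
```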
Audience Question: Which words are the most common and are taught first in ESL classes?
Answer: basic words such as "a", "the", etc. (a handful of words that fall on the very left side of the graph)
-Dr. Seuss books are based off these words
Types of Images
- Vector – an image defined by mathematical objects (i.e. a set of lines, points, curves, equations). Adobe Illustrator uses a vector format.
- Raster – an image made up of small dots, known as pixels, in different colors (i.e. a matrix or grid of pixel values). If we scale the grid, the image degrades (e.g. in Photoshop); a fixed grid of pixels cannot precisely reproduce different shades and colors at a new size.
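A small Python sketch of why the two types scale differently. The nearest-neighbor method, the 2x2 sample grid, and the function names are assumptions chosen for illustration; image editors offer several resampling methods.

```python
def upscale_nearest(grid, factor):
    """Enlarge a raster by repeating each pixel; no new detail is created."""
    out = []
    for row in grid:
        wide = [pixel for pixel in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out

def scale_line(line, factor):
    """Enlarge a vector line by scaling its endpoints; it stays an exact line."""
    (x0, y0), (x1, y1) = line
    return ((x0 * factor, y0 * factor), (x1 * factor, y1 * factor))

raster = [[0, 255],
          [255, 0]]
for row in upscale_nearest(raster, 2):
    print(row)  # each original pixel becomes a 2x2 block

print(scale_line(((0, 0), (1, 1)), 2))
```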
Note: There are many types of image formats from .art to .wpg
Image Comparison:
•lossy vs. lossless
•.gif image compression (CompuServe) 1980s
–limited color palette plus compression of repeated runs (the GIF format itself uses LZW dictionary coding)
–good for “flat” images / rectangular images / images with little color change
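The run-length idea mentioned above can be sketched in Python as follows. This is plain run-length encoding, used here only to show why flat images compress well; the actual GIF format uses LZW coding, and the palette-index rows are invented sample data.

```python
from itertools import groupby

def rle_encode(pixels):
    """Collapse each run of identical values into a (value, run_length) pair."""
    return [(value, len(list(group))) for value, group in groupby(pixels)]

def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original pixel row."""
    return [value for value, count in runs for _ in range(count)]

# One row of palette indexes from a "flat" image: long runs of the same color.
flat_row = [0] * 12 + [3] * 4 + [0] * 8
# One row where the color changes at every pixel: no runs to exploit.
busy_row = [i % 5 for i in range(24)]

print(rle_encode(flat_row))
print(len(rle_encode(busy_row)), "runs for", len(busy_row), "pixels")
```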
•.jpg image compression
(Joint Photographic Experts Group) 1980s & 1990s
–extract an 8x8 pixel block from the picture
–calculate the discrete cosine transform for each element in the block
–a quantizer rounds off the discrete cosine transform (DCT) coefficients according to the specified image quality (this phase is where most of the original image information is lost, thus it is dubbed the lossy phase of the JPEG algorithm)
–the coefficients are compressed using an encoding scheme such as Huffman coding or arithmetic coding
- good for pictures
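The transform and quantization steps above can be sketched in pure Python. This is a simplified illustration under stated assumptions: a single uniform quantization step of 16 stands in for JPEG's per-coefficient quantization tables, the level shift and the entropy-coding stage are omitted, and the gradient test block is invented.

```python
import math

N = 8  # JPEG works on 8x8 pixel blocks

def _scale(k):
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

def _basis(i, k):
    return math.cos((2 * i + 1) * k * math.pi / (2 * N))

def dct2(block):
    """2-D DCT-II of an 8x8 block (orthonormal scaling)."""
    return [[_scale(u) * _scale(v) * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                                         for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]

def idct2(coeffs):
    """Inverse of dct2."""
    return [[sum(_scale(u) * _scale(v) * coeffs[u][v] * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]

def quantize(coeffs, step):
    """The lossy phase: round each coefficient to a multiple of `step`."""
    return [[round(c / step) for c in row] for row in coeffs]

def dequantize(levels, step):
    return [[level * step for level in row] for row in levels]

# A smooth gradient block, typical of photographic content.
block = [[100 + 2 * x + y for y in range(N)] for x in range(N)]
levels = quantize(dct2(block), step=16)
restored = idct2(dequantize(levels, step=16))

nonzero = sum(1 for row in levels for level in row if level != 0)
max_error = max(abs(block[x][y] - restored[x][y]) for x in range(N) for y in range(N))
print("nonzero coefficients:", nonzero, "of", N * N)
print("largest pixel error after the round trip:", round(max_error, 2))
```

Most quantized coefficients come out as zero for smooth blocks, which is what makes the final Huffman or arithmetic coding stage effective.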
SOUND
Streaming
- Example: WMA
Music is sampled 44,100 times per second. The samples are 2 bytes (16 bits) long.
Separate samples are taken for the left and right speakers in a stereo system
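As an illustration of the sampling described above, the following Python sketch produces one second of 16-bit stereo samples for a pure tone. The 440 Hz tone, the amplitude, and the function name are assumptions made for this example.

```python
import math
import struct

SAMPLE_RATE = 44_100  # samples per second, per channel

def sine_pcm(freq_hz, seconds, amplitude=0.5):
    """Return raw 16-bit little-endian stereo samples for a sine tone."""
    frames = bytearray()
    for i in range(int(SAMPLE_RATE * seconds)):
        value = amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
        sample = int(round(value * 32767))  # fit into a signed 2-byte integer
        frames += struct.pack("<hh", sample, sample)  # left and right channel
    return bytes(frames)

pcm = sine_pcm(440.0, 1.0)
print(len(pcm), "bytes for one second of stereo audio")
```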
Capped bit rate
- 128kbps (CD quality)
- 64kbps (tape quality)
- 16kbps (low quality)
Non-streaming
Examples:
- WAV
- MP3: optimized compression algorithm (i.e. gets rid of sounds you can’t hear)
MP3
MP3 compression (Moving Picture Experts Group audio Layer-3)
-Gets rid of sounds the ear cannot hear
- 44,100 samples/second
x 16 bits/sample
x 2 channels
= 1,411,200 bits per second
-About 32 MB per 3 min of uncompressed audio
-Reduced to about 3 MB per 3 min by exploiting the facts that:
There are certain sounds that the human ear cannot hear.
There are certain sounds that the human ear hears much better than others.
If there are two sounds playing simultaneously, we hear the louder one but cannot hear the softer one
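The size figures above can be verified with a few lines of Python. The sketch assumes 1 MB = 1,000,000 bytes and 1 kbps = 1,000 bits per second, and uses the 128 kbps rate from the capped bit-rate list as the MP3 target.

```python
SAMPLE_RATE = 44_100      # samples per second
BITS_PER_SAMPLE = 16
CHANNELS = 2
SECONDS = 3 * 60          # a 3-minute song

raw_bps = SAMPLE_RATE * BITS_PER_SAMPLE * CHANNELS
raw_mb = raw_bps * SECONDS / 8 / 1_000_000

mp3_bps = 128_000         # assumed MP3 bit rate
mp3_mb = mp3_bps * SECONDS / 8 / 1_000_000

print(f"uncompressed: {raw_bps:,} bits/s, {raw_mb:.1f} MB per 3 min")
print(f"MP3 at 128 kbps: {mp3_mb:.2f} MB per 3 min")
print(f"reduction: about {raw_bps / mp3_bps:.0f}x")
```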
How the ear translates sound:
-Sound waves propagate through a medium such as air.
1) Outer ear (ear lobe, canal, etc.): sound waves pass through; this amplifies them only slightly, about 4x
2) Middle ear (ear drum, bones: hammer, anvil, stirrup): the ear drum is vibrated by the sound waves, and much of the amplification occurs here
3) Inner ear (see pics below): hair cells vibrate and transmit signals to the brain
Note: The exact process by which sound is heard is still unknown