Data Representation

Shawlands Academy

Higher Computing

Systems Unit

Higher Systems: Data Representation

Topic 1

Data Representation

Representation of positive numbers in binary including place values and range up to and including 32 bits

Conversion from binary to decimal and vice versa

Description of the representation of negative numbers using two’s complement using examples of up to 8 bit numbers

Description of the relationship between the number of bits assigned to the mantissa/exponent and the range and precision of floating point numbers

Conversion to and from bit, byte, Kilobyte, Megabyte, Gigabyte, Terabyte. (Kb, Mb, Gb, Tb)

Description of Unicode and its advantages over ASCII

Description of the bit map method of graphic representation using examples of colour/greyscale bit maps

Description of the relationship of bit depth to the number of colours using examples up to and including 24 bit depth (true colour)

Description of the vector graphics method of graphic representation

Description of the relative advantages and disadvantages of bit mapped and vector graphics

Description of the relationship between the bit depth and file size

Explanation of the need for data compression using the storage of bit-map graphic files, as examples


Introduction

Computers are called two-state devices because all data is stored using two values. All the logic circuits used in digital computers are based upon two-state logic. That is, quantities can only take one of two values, typically 0 or 1. These quantities are represented internally by voltages on lines, zero voltage representing 0 and the operating voltage of the device representing 1. Two-state logic is used because such devices are easy and economical to produce.

Measures:

A bit is a binary digit: a 0 or a 1.
8 bits make a byte.
1024 bytes in 1 Kilobyte (1024 = 2^10)
1024 Kbytes in 1 Megabyte (2^20 bytes)
1024 Mbytes in 1 Gigabyte (2^30 bytes)
1024 Gbytes in 1 Terabyte (2^40 bytes)

To go from bits to bytes, divide by 8.

To go from Kbytes to Mbytes, divide by 1024, etc.
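If you want to check conversions like these on a computer, the rules above can be sketched in a few lines of Python. This is only an illustration: the names bits_to_bytes and bytes_to_unit are made up for this example.

```python
BITS_PER_BYTE = 8
STEP = 1024  # bytes per Kilobyte, Kilobytes per Megabyte, and so on


def bits_to_bytes(bits):
    """To go from bits to bytes, divide by 8."""
    return bits / BITS_PER_BYTE


def bytes_to_unit(num_bytes, unit):
    """Divide by 1024 once for Kb, twice for Mb, three times for Gb, four for Tb."""
    steps = {"Kb": 1, "Mb": 2, "Gb": 3, "Tb": 4}
    return num_bytes / STEP ** steps[unit]


print(bits_to_bytes(16))             # 2.0
print(bytes_to_unit(2048, "Kb"))     # 2.0
print(bytes_to_unit(1048576, "Mb"))  # 1.0
```

To go the other way (e.g. Mbytes to Kbytes) you would multiply by 1024 instead of dividing.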

Numbers

We use the base 10 number system to represent whole numbers, integers and fractional numbers. This number system uses the 10 digits 0 to 9 to represent numbers. The value of a decimal digit is given by its position (its place value) within the number.

Example:

10 000   1 000   100   10   1

     3       4     0    4   3

34 043 = 3 x 10 000 + 4 x 1 000 + 0 x 100 + 4 x 10 + 3 x 1

The binary number system

When numbers are represented electronically, the most convenient base is 2, where each column, reading from the right is a power of two. The base 2 number system uses 2 symbols, 0 and 1 to represent a value.

Example 1:

16   8   4   2   1

 1   0   1   1   0

10110 = 1 x 16 + 1 x 4 + 1 x 2 = 22

Example 2:

1101100110011010 is

32768 / 16384 / 8192 / 4096 / 2048 / 1024 / 512 / 256 / 128 / 64 / 32 / 16 / 8 / 4 / 2 / 1
1 / 1 / 0 / 1 / 1 / 0 / 0 / 1 / 1 / 0 / 0 / 1 / 1 / 0 / 1 / 0

= 32768 + 16384 + 4096 + 2048 + 256 + 128 + 16 + 8 + 2 = 55 706

So, to convert a binary number to our numbers (denary, or base 10), you put the headings starting from the right : …. 32 16 8 4 2 1 above the number and add up the headings where there is a 1.
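The "headings" method can be written as a short Python sketch. The function name binary_to_denary is just a label chosen for this illustration; the binary number is passed in as a string of 0s and 1s.

```python
def binary_to_denary(bits):
    """Add up the headings (1, 2, 4, 8, ...) wherever there is a 1."""
    total = 0
    heading = 1                    # the right-most heading
    for digit in reversed(bits):   # work from the right, as in the notes
        if digit == "1":
            total += heading
        heading *= 2               # each heading is double the one before
    return total


print(binary_to_denary("10110"))  # 22
print(binary_to_denary("11101"))  # 29
```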

The binary number system has the huge advantage that only two symbols are required, 0 and 1. These can easily be represented in a computer system by a switch or transistor being on or off, or by a high or low voltage level. Imagine how difficult it would be to represent 10 discrete logic values for the base 10 number system.

Binary data is also easy to store, e.g. on magnetic discs using N/S magnetism or on CDs using pits and lands to reflect light.

Binary representation also reduces the number of arithmetic rules that need to be applied in calculations. To add our numbers you need 100 rules (every pair of digits from 0+0 up to 9+9). To add binary numbers you just need to know four: 0+0, 0+1, 1+0 and 1+1 (=10).

So the advantages of binary are:

Simple arithmetic

Simple electronic circuits

Wide range of storage devices can use 2 values.

Another advantage is resistance to ‘signal degradation’. Voltages are never perfectly stable, so if you used 0V for 0, 1V for 1 and so on up to 9V for 9, and a 9V signal dropped to 8.5V, is it an 8 or a 9? With binary you can have a large difference between the two values (e.g. 0V for 0 and 8V for 1), so a small drop does not change the value.

Converting our numbers to binary:

There are two ways to convert decimal numbers into binary.

Method 1.

To convert 29 into binary: write down the binary headings (don’t go past 29), then work out which headings add up to 29:

16 8 4 2 1

To get 29 we need a 16, an 8 a 4 and a 1, so 29 =

1 1 1 0 1
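Method 1 can be sketched in Python as follows. This is one possible way of writing it, not the only one, and the function name is invented for this example. It builds the headings without going past the number, then works from the largest heading down, writing a 1 whenever the heading still fits.

```python
def denary_to_binary_headings(number):
    """Method 1: pick the headings that add up to the number."""
    # Build the headings 1, 2, 4, 8, ... without going past the number.
    headings = [1]
    while headings[-1] * 2 <= number:
        headings.append(headings[-1] * 2)

    bits = ""
    for heading in reversed(headings):   # largest heading first
        if heading <= number:
            bits += "1"                  # we need this heading
            number -= heading
        else:
            bits += "0"                  # this heading is too big
    return bits


print(denary_to_binary_headings(29))  # 11101
```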

Method 2.

This is guaranteed to work on any number and is useful for very large numbers. Here you repeatedly divide by 2, writing down the remainder each time, until there is nothing left. The binary number is formed by reading the remainders from the bottom up.

29 ÷ 2 = 14 R 1

14 ÷ 2 = 7 R 0

7 ÷ 2 = 3 R 1

3 ÷ 2 = 1 R 1

1 ÷ 2 = 0 R 1

Reading the remainders from the bottom up: 29 = 1 1 1 0 1
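The divide-by-2 method translates almost directly into code. Here is a minimal Python sketch; the function name is made up for this illustration.

```python
def denary_to_binary_division(number):
    """Method 2: keep dividing by 2 and collect the remainders."""
    if number == 0:
        return "0"
    remainders = []
    while number > 0:
        number, remainder = divmod(number, 2)  # quotient and remainder together
        remainders.append(str(remainder))
    # The first remainder found is the right-most bit, so read them in reverse.
    return "".join(reversed(remainders))


print(denary_to_binary_division(29))  # 11101
```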

Something to know about storing numbers on a computer is that a fixed number of bits is always used. Let’s say a computer uses 32 bits to store numbers, then

3 would be stored as: 00000000000000000000000000000011

324 564 046 would be stored as: 00010011010110000111010001001110

Programming would be too difficult if variable length numbers were used, because the computer wouldn’t know where a number ended! This means that there is a fixed range of values that can be stored, determined by how many bits you use.

The range of positive integers.

With 1 bit you can get 2 possible values: 0 or 1.

With 2 bits you can get 4 values: 00, 01, 10 and 11

With 3 bits you get 8 values: 000, 001, 010, 011, 100, 101, 110 and 111

1 bit     2^1 values = 2      range 0 to 1

2 bits    2^2 values = 4      range 0 to 3

3 bits    2^3 values = 8      range 0 to 7

………

8 bits    2^8 values = 256    range 0 to 255

………

n bits    2^n values          range 0 to 2^n - 1

The largest positive number you can code is always one less than the number of values, because we start counting at 0.

For n bits the range is 0 to 2^n - 1
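As a quick check of the rule, here is a small Python sketch. The names range_for_bits and store_fixed_width are invented for this example. The second function pads a number with leading 0s to a fixed width and refuses numbers that do not fit.

```python
def range_for_bits(n):
    """Smallest and largest positive integer that fit in n bits."""
    return 0, 2 ** n - 1


def store_fixed_width(number, n):
    """Write a positive integer in binary using exactly n bits."""
    smallest, largest = range_for_bits(n)
    if not smallest <= number <= largest:
        raise ValueError(f"{number} does not fit in {n} bits")
    return format(number, "b").zfill(n)  # add 0s to the front


print(range_for_bits(3))        # (0, 7)
print(store_fixed_width(3, 8))  # 00000011
```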

You can add, subtract, multiply & divide in binary exactly the same as our numbers although you would find division difficult to get the hang of.

The only thing you need to remember is 1+ 1 = 10 (and 1+1+1 = 11). Also 10 – 1 = 1

Simple really!

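So 1010 + 1011 =

      1 0 1 0

  +   1 0 1 1

    1 0 1 0 1

(There is a carry from the 2s column into the 4s column, and another from the 8s column into the 16s column.)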
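Column addition with carries can be sketched in Python like this. It is only an illustration of the pencil-and-paper method (the function name add_binary is made up), working from the right-hand column and carrying into the next one.

```python
def add_binary(a, b):
    """Add two binary numbers given as strings, column by column."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)   # line the columns up
    carry = 0
    result = ""
    for digit_a, digit_b in zip(reversed(a), reversed(b)):
        total = int(digit_a) + int(digit_b) + carry
        result = str(total % 2) + result    # digit written in this column
        carry = total // 2                  # 1 if the column came to 10 or 11
    if carry:
        result = "1" + result               # final carry becomes a new column
    return result


print(add_binary("1010", "1011"))  # 10101
```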

EXERCISE 1:

Convert these binary numbers to decimal:

a) 10101 b) 11001 c) 11100010 d) 10101010 e) 11110000

Convert these decimal numbers to binary:

a) 27 b) 37 c) 56 (use 8 bits) d) 97 (use 8 bits) e) 765 (use ÷ by 2 method)

Why are computers called two state devices?

Give two reasons why computers use binary.

What range of positive numbers can be stored using:

a) 1 byte b) 16 bits c) 20 bits d) 32 bits (use powers of 2 for the answers to c and d)

Add these binary numbers:

a)   1 0 1 0      b)   1 1 0 1 1 1      c)   1 0 1 0 1 1 0 0

   +     1 1         + 1 0 1 1 0 1         + 1 1 0 1 0 1 0 1

If you use a fixed number of bits to store numbers, what will happen if there is a carry at the end?

What number has been tattooed on this leg:

a) If you go ankle to knee?

b) If you go knee to ankle?

Negative numbers

An obvious way of getting computers to store –ves would be to make the first bit a 1 for negative and a 0 for positive. This is called sign and magnitude, but unfortunately it doesn’t work well because ordinary binary addition gives the wrong answer and there are two zeros: a +ve 0 and a –ve 0.

So two’s complement is used because there is only one 0 and arithmetic works correctly.

In a simplified model, the ALU in a processor only needs to carry out two operations:

Binary addition
Inverting (or flipping) bits changing 1 to 0 and vice versa.

Two’s complement allows the ALU to store negatives and to subtract using these two operations. Subtraction is just adding the negative ( i.e. 7 – 2 = 7 + (-2) )

To store a –ve in two’s complement:

Step 1: Write down the positive value in binary

Step 2: Flip the bits ( 1 becomes 0 and 0 becomes 1)

Step 3: Add 1

For two’s complement to work correctly you must use a fixed number of bits for each number, so add 0s to the front to get the required amount.

Example 1: What is -17 in two’s complement using 8 bits

Step 1: 17 = 00010001

Step 2: flip: 11101110

Step 3: add 1: 11101111

So 11101111 is -17 in 8-bit two’s complement.

Example 2: What is -88 in 8-bit two’s complement?

Step 1: 88 = 01011000

Step 2: flip: 10100111

Step 3: add 1: 10101000

So 10101000 is -88 in two’s complement.

You must remember that positive numbers are still stored in ordinary binary. So you only need to do step 1 for +ve numbers.
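The three steps can be followed one at a time in a short Python sketch. The function name is invented for this example, and it assumes the value you give it is small enough to fit in the chosen number of bits (it does not check).

```python
def to_twos_complement(value, bits=8):
    """Write a whole number in two's complement using a fixed number of bits."""
    if value >= 0:
        # Positive numbers are stored in ordinary binary (step 1 only).
        return format(value, "b").zfill(bits)
    # Step 1: write down the positive value in binary, padded with 0s.
    positive = format(-value, "b").zfill(bits)
    # Step 2: flip the bits.
    flipped = "".join("1" if bit == "0" else "0" for bit in positive)
    # Step 3: add 1.
    plus_one = int(flipped, 2) + 1
    return format(plus_one, "b").zfill(bits)


print(to_twos_complement(-17))  # 11101111
print(to_twos_complement(-88))  # 10101000
print(to_twos_complement(17))   # 00010001
```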

Using two’s complement, -ve numbers will always start with a 1, +ves with a 0. Also two’s complement is its own inverse. So to convert back you flip and add 1.

So going from binary to our numbers:

Example 3: What is this two’s complement number? 10110001

We can see it is a –ve because it starts with 1.

So Flip : 01001110

Add 1: 01001111

Put the usual binary headings above this number and you get: 64 + 8 + 4 + 2 + 1 = 79

So the original number was –79

Example 4: What is the value of the two’s complement number 11001111?

Flip: 00110000. Add 1: 00110001. This works out as 49, so the answer is -49.

N.B. If the number starts with a 0 it is positive, just put the headings above it to find out what it is.
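Going back from two’s complement to our numbers can be sketched in the same way. Again the function name is only a label for this illustration; the bits are passed in as a string.

```python
def from_twos_complement(bits):
    """Work out the denary value of a two's complement bit pattern."""
    if bits[0] == "0":
        # Starts with 0, so it is positive: just use the headings.
        return int(bits, 2)
    # Starts with 1, so it is negative: flip the bits, then add 1.
    flipped = "".join("1" if bit == "0" else "0" for bit in bits)
    magnitude = int(flipped, 2) + 1
    return -magnitude


print(from_twos_complement("10110001"))  # -79
print(from_twos_complement("11001111"))  # -49
print(from_twos_complement("00010001"))  # 17
```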

EXERCISE 2:

Write these numbers in 8-bit two’s complement:

a) -34 b) -19 c) -97 d) -64 e) 28

These binary numbers are stored in 8-bit two’s complement, work out their value in denary.

a) 11000000 b) 10111111 c) 10011000 d) 01010001

Why do computers use two’s complement?

Work out these subtractions by i) subtracting them (remember 10 – 1 = 1 ) and ii) by adding the two’s complement.

a) 01001010 – 00000110 b) 01100111 – 00111111

(throw away the carry at the end when you add the two’s complement)

What range of values can be stored using 8 bits in two’s complement?

That covers how to store Integers. What about numbers outside of the range of integers? Also what about decimals?

For these numbers, computers use Floating Point. This is the same as Standard Form except the computer does not store the point or the base and counts the places from the BEGINNING OF THE NUMBER.

93 000 000 in Standard Form : 9.3 x 10^7

93 000 000 in Floating Point : 93 8

The 93 is called the MANTISSA

The 8 is called the EXPONENT.

Again, computers use a fixed number of bits to store floating point numbers. For instance with 32 bits they might use 24 bits for the mantissa and 8 bits for the exponent. Or they could use 20 bits mantissa, 12 bits exponent. This has an effect on the ACCURACY with which numbers are stored and the RANGE of values that can be stored.

With our numbers imagine we have a calculator that can only store 3 digits for the mantissa and 1 digit for the exponent:

93 000 000 would be stored:

12 875 000 would be stored:

Note the loss of accuracy.

The length of the mantissa determines how accurately (or precisely) floating point numbers can be stored.

2 340 000 000 cannot be stored at all. The point moves 10 places and this is out of our range. So the exponent determines the range of numbers that can be stored in floating point.

The more bits for the mantissa, the higher the accuracy

The more bits for the exponent, the bigger the range.

And vice versa.
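The calculator idea (3 digits for the mantissa, 1 digit for the exponent) can be tried out with a small Python sketch. This is a toy model in base 10, not how a real processor stores floating point, and it makes two assumptions that are not in the notes: it only handles positive whole numbers, and it simply cuts off digits that do not fit (a real machine may round instead). The function names are invented for this example.

```python
def to_toy_float(number, mantissa_digits=3, exponent_digits=1):
    """Store a positive whole number as (mantissa, exponent) in base 10."""
    digits = str(number)
    exponent = len(digits)  # places counted from the beginning of the number
    largest_exponent = 10 ** exponent_digits - 1
    if exponent > largest_exponent:
        raise OverflowError("exponent does not fit: number is out of range")
    # Assumption: digits that do not fit in the mantissa are cut off.
    mantissa = digits[:mantissa_digits].ljust(mantissa_digits, "0")
    return mantissa, exponent


def from_toy_float(mantissa, exponent):
    """Rebuild the whole number that a (mantissa, exponent) pair stands for."""
    return int(mantissa) * 10 ** (exponent - len(mantissa))


print(to_toy_float(93000000))    # ('930', 8)
print(to_toy_float(12875000))    # ('128', 8)
print(from_toy_float("128", 8))  # 12800000
```

Under these assumptions 12 875 000 comes back as 12 800 000, which shows the loss of accuracy from a short mantissa, and to_toy_float(2340000000) raises an OverflowError because the exponent 10 needs two digits.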

TEXT

ASCII

American Standard Code for Information Interchange (ASCII) was first developed for teletypewriters and is now an internationally agreed standard for storing information.

ASCII uses 7 bits per character, giving 128 possible characters. 95 of these are printable characters (including the space), enough to represent every letter, digit and punctuation mark used in English, and these form its character set. Each character has its own unique code.

The other 33 codes are special codes known as control characters. They make something happen like new line, clear screen etc.

Now computers use bytes (groups of 8 bits), so the extra bit can be used either for error checking or for extending the character set to include characters used in French, German and other Western European languages, such as ê, å and ñ.

Now the problem with ASCII is that it was designed for our Latin alphabet. As we have seen, it can be extended to cover Western European character sets, but what about Urdu, Arabic, Chinese and so on?

So ASCII is being superseded by Unicode. In this course Unicode is treated as a 16 bit code giving 65 536 characters, which was its original design (the standard has since grown beyond 16 bits). ASCII forms the first 128 characters and a Western European extended set (Latin-1) forms the next 128 codes.

So Unicode has the advantage of including existing ASCII while extending to all character sets, so you can code every alphabet in the world (including ancient unused ones). The disadvantage is that text takes up twice as much storage, or twice as much bandwidth, as 8-bit ASCII.
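You can see character codes and the storage difference for yourself in Python. In this sketch the built-in ord gives a character’s code, and the "utf-16-be" encoding is used as a stand-in for the 16 bit Unicode described above (2 bytes for each of these characters). The function name storage_in_bytes is made up for this example.

```python
def storage_in_bytes(text):
    """Bytes needed for the text at 1 byte per character and at 2 bytes per character."""
    ascii_bytes = len(text.encode("ascii"))        # one byte per character
    unicode_bytes = len(text.encode("utf-16-be"))  # two bytes per character here
    return ascii_bytes, unicode_bytes


print(ord("A"), format(ord("A"), "07b"))  # 65 1000001
print(storage_in_bytes("Data"))           # (4, 8)
```

Note that spaces and punctuation marks are characters too, so they also need to be counted.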

EXERCISE 3

Describe how very large or small numbers are stored.
What is the effect of increasing the number of bits allocated to the mantissa?
When running a program you can get an ‘overflow’ error. What do you think this error means in relation to floating point storage?
Give one advantage and one disadvantage of using Unicode rather than ASCII.
How many different characters (or ‘glyphs’) can be stored in Unicode?
How many bytes of storage would be needed for the sentence inside the rectangle below if it was stored in Unicode?

GRAPHICS

There are two methods of storing graphics:- bitmap and vector.

BITMAP

In Maths a mapping is a correspondence between two sets like whole numbers and their squares.

In Black & White (monochrome) graphics there is a simple 1 – 1 correspondence between each pixel and each bit.

For colour we can use more than 1 bit per pixel (called the colour depth or bit depth). There will still be a mapping, but now 1 pixel will map to more than 1 bit.

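If we have 4 colours you will need 2 bits per pixel, with one code for each colour, possibly like this:

00   01   10   11

Now the bit map for a small picture that is 4 pixels across and 4 pixels down would look like this, with each row holding four 2-bit pixel codes:

01 00 10 11

00 01 00 00

11 11 00 10

10 00 01 00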
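To see how a row of bits maps back to pixels, here is a short Python sketch. It takes the four rows from the example, written without spaces, and splits each one into 2-bit codes. The colour names in PALETTE are purely hypothetical, since any four colours could be assigned to the four codes.

```python
# Hypothetical colour key: any four colours could be used for the four codes.
PALETTE = {"00": "white", "01": "red", "10": "green", "11": "blue"}


def split_pixels(row, bit_depth):
    """Cut a row of bits into one code per pixel."""
    return [row[i:i + bit_depth] for i in range(0, len(row), bit_depth)]


rows = ["01001011", "00010000", "11110010", "10000100"]
for row in rows:
    codes = split_pixels(row, 2)
    print(codes, [PALETTE[code] for code in codes])
```

With 2 bits per pixel, each 8 bit row describes 4 pixels, so the whole picture needs 4 x 4 x 2 = 32 bits.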

If you use n bits you will get 2^n different colours or shades; n is called the colour depth or bit depth. So:

No. of Bits / No. of Colours
1 / 2
2 / 4
4 / 16
8 / 256
16 / 65 536
24 / 16 777 216
n / 2^n

24 bit colour depth is also called true colour, as its 16.7 million shades are roughly as many as the human eye can distinguish.

The actual bitmaps are stored in the VRAM (Video RAM) on the graphics card. The amount of VRAM on the card determines the maximum colour depth and resolution (number of pixels) your screen can display.

You can easily calculate the storage requirements for a screen using the formula:

No. of BITS = Pixels across x Pixels down x bit depth.

Example 1: A 1280 by 800 screen using 8 bit colour depth needs:

1280 x 800 x 8 BITS

= 8192000 bits

/8 = 1024000 bytes

/1024 = 1000 Kbytes

Example 2: Calculate the storage required for a screen using 1620 by 1280 pixels with 65 536 colours.

Bits = 1620 x 1280 x 16 (because 2^16 = 65 536)

= 33177600 bits

/8= 4147200 bytes

/1024= 4050 Kbytes

/1024= 3.96 Mbytes

Note how the number of bits per pixel increases the file size. 24 bit colour produces a file 3 times larger than 8 bit colour.
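The storage formula and the two worked examples can be checked with a short Python sketch. The function names are invented for this illustration. The first works out the bit depth needed for a given number of colours, and the second applies the formula and then divides down to bytes, Kbytes and Mbytes.

```python
def bit_depth_for_colours(colours):
    """Smallest number of bits per pixel that gives at least this many colours."""
    return max(1, (colours - 1).bit_length())


def bitmap_storage(pixels_across, pixels_down, bit_depth):
    """Return the storage needed as (bits, bytes, Kbytes, Mbytes)."""
    bits = pixels_across * pixels_down * bit_depth
    num_bytes = bits / 8
    kbytes = num_bytes / 1024
    mbytes = kbytes / 1024
    return bits, num_bytes, kbytes, mbytes


# Example 1: 1280 by 800 pixels with 8 bit colour depth.
print(bitmap_storage(1280, 800, 8)[2])   # 1000.0 (Kbytes)

# Example 2: 1620 by 1280 pixels with 65 536 colours.
depth = bit_depth_for_colours(65536)     # 16
bits, num_bytes, kbytes, mbytes = bitmap_storage(1620, 1280, depth)
print(bits, kbytes, round(mbytes, 2))    # 33177600 4050.0 3.96
```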

VECTOR GRAPHICS

The second way of storing graphics is vector or object orientated graphics. Here graphics are stored by their objects and their attributes.

In a graphics package you have a set of drawing tools. You can choose rectangle, circle, polygon etc. You can also choose colours for the lines, fill patterns, line thickness and so on.

The objects are rectangle, circle etc. The attributes are line colour, line thickness, fill patterns etc.

Here in Fireworks I have drawn a circle choosing the object here.

The attributes (or properties) can be changed here.

Word uses vector graphics, Paint uses bitmap.

Vector (or object orientated) stores graphics by objects and their attributes. This is just a list of numbers:

3, 100,200,250,3,4,2,1 where :

3 could be a circle

100,200 its centre

250 the radius

3 the line thickness

4 line colour

2 fill pattern

1 the layer
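As a rough sketch of this idea in Python, the list of numbers can be given names and then edited at the object level. The field names and the functions decode_object and scale_object are invented for this illustration, and the order of the fields simply follows the example above.

```python
# Assumed order of the numbers, following the circle example.
FIELDS = ["shape", "centre_x", "centre_y", "radius",
          "line_thickness", "line_colour", "fill_pattern", "layer"]


def decode_object(numbers):
    """Attach a name to each number in the list."""
    return dict(zip(FIELDS, numbers))


def scale_object(obj, factor):
    """Enlarge an object by changing its attributes, not its pixels."""
    scaled = dict(obj)
    for field in ("centre_x", "centre_y", "radius"):
        scaled[field] = obj[field] * factor
    return scaled


circle = decode_object([3, 100, 200, 250, 3, 4, 2, 1])
bigger = scale_object(circle, 2)
print(bigger["radius"], len(bigger))  # 500 8
```

The enlarged circle is still described by the same 8 numbers, so the amount of storage does not change however large the object is drawn.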

You should get a feel for the differences between vector and bitmap yourself by trying AppleWorks, where Draw is vector and Paint is bitmap, or by comparing the Paint program on a PC with Word graphics.

You can also examine the size of any file you save by right clicking and choosing properties. Bitmaps are usually much bigger files.

Differences between bitmap and vector:

First of all you need to understand the effect of enlarging a graphic:

Scaling a bit map in a painting package is done by applying a scale factor to each pixel. For an enlargement, there is the same number of pixels as in the original, each pixel just gets bigger.

A screen’s resolution might be 100 dpi. If you print that on a printer with 600 dpi then each pixel is scaled by a factor of 6, i.e. it gets larger and appears ‘blocky’.

With vector, the computer uses a sort of formula to draw each object, so it can use all 600 dots per inch on the printer. In fact the graphic actually gets finer and improves.

Bitmap is called resolution dependent and vector is resolution independent.
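The ‘blocky’ effect of enlarging a bitmap can be shown with a tiny Python sketch. Here a black and white bitmap is written as rows of characters ("#" for a black pixel, "." for a white one), which is just a convenient assumption for this example, and the made-up function enlarge_bitmap repeats every pixel by the scale factor in both directions.

```python
def enlarge_bitmap(rows, factor):
    """Scale a bitmap by turning each pixel into a factor-by-factor block."""
    enlarged = []
    for row in rows:
        wide_row = "".join(pixel * factor for pixel in row)  # stretch across
        enlarged.extend([wide_row] * factor)                 # stretch down
    return enlarged


small = [".#",
         "#."]
for line in enlarge_bitmap(small, 3):
    print(line)
```

No new detail is created: each original pixel just becomes a bigger square, which is why a bitmap is resolution dependent.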

On the next page are printouts to show the differences, both of these graphics looked the same on screen.

Advantages of bit mapped graphic representation

A bit mapped image can be manipulated at the pixel level. Thus a designer may apply particular colour values to a selected pixel area to produce shading or texture effects. You can use spray cans, you can use an eraser.

It is possible to create a wider range of irregular shapes and patterns by simply deleting pixels or adding pixels anywhere on the image.

Disadvantages of bit mapped representation

Requires large amounts of storage space;

Image becomes ‘blocky’ when scaled;

Does not take advantage of resolutions that are higher than the resolution of the image.

Advantages of vector graphic representation:

Requires less storage space than a bit mapped image;

They can be edited at the "object" level, thus allowing the user to reposition, scale and delete entire objects, or groups of objects, with ease;

Objects can be grouped to form larger objects that can then be manipulated as a single image;

Objects can be layered.

Images are resolution independent meaning that they can use the full quality of the display or print device.

BITMAP: FOR / BITMAP: AGAINST / VECTOR: FOR / VECTOR: AGAINST
Edit individual pixels. / Large amount of storage required. / Less storage required. / Cannot edit pixels.
Can create irregular shapes and lines. / Resolution dependent, loses quality on printer. / Edit individual objects, e.g. move, scale, change colour, layer. / Do not have eraser or spray can.
Use brush and spray can for paint effects. / - / Resolution independent, can use full quality of a printer. / -

Because bitmaps take up so much storage space you often need to compress files for saving on digital camera storage cards or displaying on the Web. Specialised compression results in much smaller file sizes, with images that are often indistinguishable from the original as far as the human eye is concerned. Two standards that will produce compressed image files are GIF (Graphics Interchange Format), better suited to drawings and cartoons that have only a few colours in them, and JPEG (Joint Photographic Experts Group), which can compress as much as 10 times more than GIF and is also more suitable for photo-realistic images.
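To get a feel for how much a simple bitmap can shrink, here is a Python sketch using the standard zlib module. Note the assumptions: zlib uses a general-purpose lossless method (DEFLATE), which is not the method used inside GIF or JPEG files, and the test picture (one plain rectangle on a plain background, 8 bits per pixel) is made up for this example.

```python
import zlib


def make_test_bitmap(width, height):
    """Build an 8 bit per pixel bitmap: one rectangle on a plain background."""
    pixels = bytearray(width * height)      # every pixel starts as colour 0
    for y in range(height // 4, height // 2):
        for x in range(width // 4, width // 2):
            pixels[y * width + x] = 5       # fill the rectangle with colour 5
    return bytes(pixels)


bitmap = make_test_bitmap(800, 600)
compressed = zlib.compress(bitmap)

print(len(bitmap))                            # 480000
print(len(compressed) < len(bitmap))          # True
print(zlib.decompress(compressed) == bitmap)  # True
```

Pictures with large areas of the same colour compress very well this way. Photographs have far fewer repeated patterns, which is one reason a different kind of compression such as JPEG is used for them.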

EXERCISE 4

What does a ‘bitmap’ mean?
How many colours can be displayed using a colour depth of 8 bits?
Calculate the storage required for a black and white (monochrome) screen using 800 by 600 pixels.
Calculate the storage required for a screen using 2048 by 1600 pixels with true colour.
How much storage would be required for a scan of a 6” by 4” photograph at 400 dpi using 65 536 colours?
What does the term resolution independent mean?
Name two operations you can carry out on a vector graphic that cannot be done on a bitmap.
If a systems analyst has to work out how much backing storage to have for a system which uses a great number of graphics, why might they prefer to know that only bitmaps are used on the system?
Give one advantage and one disadvantage of bitmap graphics.
Give one advantage and one disadvantage of vector graphics.
An animated gif has 12 frames each 40 by 40 pixels using 256 shades of grey. Calculate the total storage for the gif animation.

Summary