ARRAYS

Arrays are usually a feature found in many programming languages. These are combinations of simple types found in the language and are called structured types. They are really a special type of data structure which has some abstractions built into the language that allow the user to use the structure easily. Arrays are also an Abstract Data Type and can be defined by the processes which make up the structure. These are:

  1. Create : Allocate the space required for storing the array.
  2. Insert: Insert values into specified elements of the array.
  3. Retrieve: Obtain values for specified elements of the array.

When arrays are stored, they are generally contiguous in memory, that is, one element follows directly after the other in one large block of memory. The system generally stores the name of the array, the point in memory where the array begins, the number of elements in the array, and the size of each element. The size of the element is determined by analyzing the type of the element which can be quite simple or vary in complexity. From a user’s point of view, all that is required is to specify the name of the array and the element number. Elements are usually numbered beginning at 0 and are said to be zero-indexed. The structure’s design is such that the user does not need to know how to get to the location requested, this is done by the system.

Mapping Functions

One-dimensional arrays

The mapping function is used by the system to find the beginning of the specified array element. Suppose that we have an array of 4 elements called ‘one’. We can picture this as

The system will know its name, the beginning point in memory and the number of elements. However, finding any element will depend on how far it moves down this chunk of memory, so the size of each element must be known and is determined by the type declaration when the array is declared. If the array above stores characters, then 1 byte is required for each element. If we were to store longs, 8 bytes would be required for each element. In order to make sure the system is looking at the beginning of the byte requested, it uses a mapping function. In general, the mapping function would be

position = start + (size of element * number of element)

Thus, if the array contains longs, then finding the start of element # 2 would be

Position = start + (8 * 2)

or 16 bytes beyond the start position. Note that this is the beginning of element #2 (which is actually the third element in the array).

Two-dimensional Arrays

The mapping function for one-dimensional arrays is quite simple. However, when we want to store two-dimensional arrays, there are other considerations. The first is that the system still stores the array as if it were one long piece of memory. Another way of looking at this is that each row is really a one-dimensional array so it is an array of one-dimensional arrays.

We already know how to do the mapping function of a one-dimensional array, in other words to find the column. What we need is to combine this with the mapping function that will tell us where the specific row starts. Obviously the size of each row is determined by the size and number of the columns.

Cs * NC

Then the position where the row starts is found by multiplying the specified row number by the row size.

Rnum * (Cs * NC)

If we know where the row starts then we can just add to this the part of the mapping function that locates the beginning of the column

Position = start + (Rnum * (Cs * NC)) + (Cnum * Cs)

Then to find element two[1][2] we find that the position from the start of the array is

Start + (1 * (1 * 3)) + (2 * 1)

or 5 bytes from the beginning of the array.

Jagged Arrays

Jagged arrays are arrays where each row can have a different number of columns. This can be useful since some rows can be almost empty while other rows have a large number of columns. If we make a rectangular two-dimensional array, then a lot of space will be wasted. However, in order to do this, we need to add an additional one-dimensional array of a length equal to the number of rows. This array stores the number of columns in each of the rows.

All of the elements in the array are 1 byte in length. If we can find the correct position for the beginning of a particular row, then the mapping function for a one-dimensional array can be used for this portion without any changes. However, to find the starting point is not as simple. What we really need to know is how many elements are stored done by summing the values in the array which contains the information about the number of columns in each row. between the starting point and the beginning of the row that we are interested in. This is

Position = start +{ Cs * }+ {Cs *Cnum }

So, finding the beginning of item [3][0], we would sum the values from 0 to 2 in the array that has the information about the number of columns in rows 0 to 2. In this case this is 6. Then multiplying 1 byte by 6 and adding the 0 for the 0 columns we find that we must move in 6 bytes from the start of the array.

Row-Major and Column-Major Addressing

There are two methods for storing, and therefore addressing two-dimensional arrays. These are row-major and column-major. So far, and in most cases, programmers and languages use a row-major method. This is the one that has appeared here and is characterized by storing the array as a set of columns. Another way of looking at it is that we have an array of rows, each composed of an array of columns. On the other hand, we could store the array as column-major. In this case we have an array of columns, each containing an array of rows. You should note that we can refer to an element as x[row][column] regardless of the way the information is stored. The mapping function takes care of the storage and the code copes with differences. As an example, a row-major scheme that prints an array in row-major order would look something like this:

for row = 1 to m do

for column = 1 to n do

print A[row, column]

end

end

The major loop is the row loop. A column-major scheme would have code that looks something like the following:

for column = 1 to n do

for row = 1 to m do

print A [row, column]

end

end

The major loop is the column loop.

Implementing Arrays using Vectors of Vectors

An alternative to using contiguous memory for arrays might be valuable. Arrays are assigned at execution time and therefore are of a fixed size. However, vectors can be dynamically allocated as the need arises. In Java, you can specify the size of the vector, or you can leave it open. If you leave it open, the size is initially 10. If more is needed, the size is doubled. There are a number of things that you can do to control the size and what happens when new elements are added and you should check the vector library of Java to see what they are. This is also covered in the text.

A two-dimensional array stored as vectors uses a one-dimensional array of references to represent the rows. Each reference refers to a vector which contains the columns for the array. In this way, the number of rows and columns is dynamic rather than static. You can also pass a two-dimensional array as a parameter in a function, but you need only pass the vector which has the row references. For and NxN array, the storage requirements for contiguous arrays is N2 elements. For the vector method there is an additional N that must be added, but both are O(n2).