CSSS 508: Intro R

2/08/06

Creating Space vs. Starting with a Null

When you create space, you make the variable ahead of time and fill it with placeholders (zero, NA, etc.). Then you go through and replace each placeholder with a real value.

mean.vector<-rep(0,10)

mean.vector

0 / 0 / 0 / 0 / 0 / 0 / 0 / 0 / 0 / 0

for(i in 1:10){

mean.vector[i]<-mean(matrix[i,])

}

What happens:

i = 1

mean.vector[1]<-mean(matrix[1,])

mean.vector

2 / 0 / 0 / 0 / 0 / 0 / 0 / 0 / 0 / 0

i = 2

mean.vector[2]<-mean(matrix[2,])

mean.vector

2 / 3 / 0 / 0 / 0 / 0 / 0 / 0 / 0 / 0

i = 3

mean.vector[3]<-mean(matrix[3,])

mean.vector

2 / 3 / 1 / 0 / 0 / 0 / 0 / 0 / 0 / 0

Etc.

Since you have already created the space, inside the for loop you have to tell R where to plug in each result (i.e. index by [i]).
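To try the create-space approach end to end, here is a minimal sketch. The matrix name dat, its 10 x 5 size, and the random seed are assumptions made only so the example runs on its own; they are not from the class data.

```r
# Hypothetical data: 10 rows, 5 columns (a stand-in for the class matrix)
set.seed(508)
dat <- matrix(rnorm(50), nrow = 10, ncol = 5)

# Create the space first: ten placeholder zeros
mean.vector <- rep(0, 10)

# Replace placeholder i with the mean of row i
for (i in 1:10) {
  mean.vector[i] <- mean(dat[i, ])
}

# Check against the built-in rowMeans(), which computes the same thing
all.equal(mean.vector, rowMeans(dat))
```

The length of mean.vector never changes here; only its contents do.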

When you start with a NULL, you add a piece to your variable with each loop.

mean.vector<-NULL

mean.vector

NULL

for(i in 1:10){

mean.vector<-c(mean.vector,mean(matrix[i,]))

}

What happens:

i = 1

mean.vector<-c(mean.vector,mean(matrix[1,]))

mean.vector

2

i = 2

mean.vector<-c(mean.vector,mean(matrix[2,]))

mean.vector

2 / 3

i = 3

mean.vector<-c(mean.vector,mean(matrix[3,]))

mean.vector

2 / 3 / 1

Etc.

Since you’re creating the space as you go, each pass through the for loop adds an element on the end, so you don’t need to index the result by [i].
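The start-with-NULL approach can be sketched the same way. Again, the matrix name dat, its size, and the seed are assumptions for illustration only.

```r
# Hypothetical data: 10 rows, 5 columns (a stand-in for the class matrix)
set.seed(508)
dat <- matrix(rnorm(50), nrow = 10, ncol = 5)

# Start with nothing: NULL has length 0
mean.vector <- NULL
length(mean.vector)

for (i in 1:10) {
  # c() glues the new mean onto the end of what we have so far,
  # so after pass i the vector is i elements long
  mean.vector <- c(mean.vector, mean(dat[i, ]))
}

length(mean.vector)
```

Notice that the only [i] is on the right-hand side, picking out the row of dat; the left-hand side is the whole vector every time.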

Common Mistakes

A few common mistakes that have come up:

1) Nothing indexed in the for loop:

new.matrix<-matrix(0,n,p)

for(i in 1:n){

new.matrix<-matrix(rnorm(p,mean.vector,1),n,p)

}

This builds a whole new n-by-p matrix on every pass through the for loop, overwriting the previous one, instead of filling in one row at a time. Only the matrix from the last pass is kept.

You have a vector of means, one for each row of the matrix.

Because the changing characteristic is on the rows, you must index on the rows.

new.matrix<-matrix(0,n,p)

for(i in 1:n){

new.matrix[i,]<-rnorm(p,mean.vector[i],1)

}

Here we have created one row’s worth of data at a time and plugged it into a row of the already created matrix. We could also use rbind:

new.matrix<-NULL

for(i in 1:n){

new.matrix<-rbind(new.matrix, rnorm(p,mean.vector[i],1))

}

Here there was no space created ahead of time; just the variable name.

So we’re adding one row at a time to new.matrix. After the first pass through the loop, new.matrix has one row; after the second pass, it has two rows; etc.

In the above example, we don’t index new.matrix by [i], but we are adding a row on every pass. So the indexing is implicit: the ith row comes from the ith pass through the loop.
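To see that filling a row and binding on a row end up in the same place, here is a small sketch that runs both. The values of n, p, and mean.vector, and the names filled and bound, are assumptions picked for illustration.

```r
# Hypothetical sizes and means, chosen only for illustration
set.seed(508)
n <- 4
p <- 1000
mean.vector <- c(0, 10, 20, 30)   # one mean per row

# Version 1: create the space, then fill row i
filled <- matrix(0, n, p)
for (i in 1:n) {
  filled[i, ] <- rnorm(p, mean.vector[i], 1)
}

# Version 2: start from NULL and bind on one row per pass
bound <- NULL
for (i in 1:n) {
  bound <- rbind(bound, rnorm(p, mean.vector[i], 1))
}

# Both are n x p, and each row's mean should land near its entry in mean.vector
dim(filled)
dim(bound)
round(rowMeans(filled), 1)
round(rowMeans(bound), 1)
```

The random draws differ between the two versions, so the matrices are not identical, but they have the same shape and the same row-by-row pattern of means.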

2) A quick note on rbind and cbind:

You only use rbind and cbind when you are binding two things together.

rbind(new.matrix, rnorm(10,0,1)) needed

rbind(rnorm(10,0,1)) not needed; you aren’t putting two things together.

Same thing with c:

You only need c( ) if you are putting numbers together that aren’t already in a vector.

c(2,3,1,5,6) needed

c(rnorm(10,0,1)) not needed; rnorm() already returns a vector
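A quick way to convince yourself is to check it in R. This is a small sketch; the names x, y, and two.rows are made up for the example.

```r
set.seed(508)
x <- rnorm(10, 0, 1)

# rnorm() already returns a vector, so wrapping it in c() changes nothing
identical(c(x), x)

# c() is needed when the numbers are not yet in a vector
y <- c(2, 3, 1, 5, 6)
length(y)

# rbind() is needed when there are two things to stack
two.rows <- rbind(x, rnorm(10, 0, 1))
dim(two.rows)
```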

3) Losing track of rows vs. columns vs. elements:

When working with a matrix: the matrix is indexed by matrix[row#, col#].

matrix[1,] – the first row

matrix[,1] – the first column

matrix[1] – the first element.
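The three kinds of indexing can be tried on a tiny example. The matrix m below is hypothetical; it is small enough that you can check each result by eye.

```r
# Hypothetical 2 x 3 matrix; R fills it column by column:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
m <- matrix(1:6, nrow = 2, ncol = 3)

m[1, ]   # first row: 1 3 5
m[, 1]   # first column: 1 2
m[1]     # first element: 1
m[2, 3]  # row 2, column 3: 6
m[4]     # a single index counts down the columns: 4
```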

Make sure that the size of what you’re plugging in matches the size of the space you’re sending it to.

Let’s say we have a matrix with n rows and p columns:

matrix[i]<-rnorm(n, mean[i],sd[i])

(Bad: plugging in an n-long vector into the space for one element)

matrix[i,] <- rnorm(n, mean[i], sd[i])

(Bad: plugging in an n-long vector into a row with space for p elements)

matrix[,i] <-rnorm(n, mean[i], sd[i])

(Good: plugging in an n-long vector into a column with space for n elements)
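The size matching can be checked directly. In this sketch n and p are deliberately different so that a mismatch shows up, and the names m, mean.vec, and sd.vec are assumptions for illustration (one mean and one sd per column).

```r
# Hypothetical sizes; n and p differ so a mismatch is visible
set.seed(508)
n <- 6
p <- 3
mean.vec <- c(0, 5, 10)   # one mean per column
sd.vec   <- c(1, 1, 1)    # one sd per column
m <- matrix(0, n, p)

# Good: an n-long vector goes into a column, which holds n elements
for (i in 1:p) {
  m[, i] <- rnorm(n, mean.vec[i], sd.vec[i])
}
dim(m)

# Bad: an n-long vector does not fit in a row that holds p elements.
# try() catches the error so the rest of the script keeps running.
bad <- try(m[1, ] <- rnorm(n, mean.vec[1], sd.vec[1]), silent = TRUE)
inherits(bad, "try-error")
```

Before assigning, it can help to compare length() of the right-hand side with the length of the row or column on the left.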