CSSS 508: Intro R
2/08/06
Creating Space vs. Starting with a Null
When you create space, you make the variable at its full size first, filled with placeholders (zero, NA, etc.). Then you go through and replace the placeholders one at a time.
mean.vector<-rep(0,10)
mean.vector
0 / 0 / 0 / 0 / 0 / 0 / 0 / 0 / 0 / 0
for(i in 1:10){
mean.vector[i]<-mean(matrix[i,])
}
What happens:
i = 1
mean.vector[1]<-mean(matrix[1,])
mean.vector
2 / 0 / 0 / 0 / 0 / 0 / 0 / 0 / 0 / 0
i = 2
mean.vector[2]<-mean(matrix[2,])
mean.vector
2 / 3 / 0 / 0 / 0 / 0 / 0 / 0 / 0 / 0
i = 3
mean.vector[3]<-mean(matrix[3,])
mean.vector
2 / 3 / 1 / 0 / 0 / 0 / 0 / 0 / 0 / 0
Etc.
Since you have already created the space, inside the for loop you have to tell R where to plug in each result (i.e. index by [i]).
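A minimal runnable sketch of the create-space approach. The handout does not say what is in the matrix, so my.matrix and its random values are made up here just for illustration.

```r
# Hypothetical 10 x 4 data matrix; the name and the values are illustration only.
set.seed(1)
my.matrix <- matrix(rnorm(40), nrow = 10, ncol = 4)

# Create the space first: ten placeholder zeros.
mean.vector <- rep(0, 10)

# Replace placeholder i with the mean of row i.
for (i in 1:10) {
  mean.vector[i] <- mean(my.matrix[i, ])
}

# The vector keeps the length it was created with.
length(mean.vector)
```

The length never changes during the loop; only the contents do.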
When you start with a NULL, you add a piece to your variable with each loop.
mean.vector<-NULL
mean.vector
NULL
for(i in 1:10){
mean.vector<-c(mean.vector,mean(matrix[i,]))
}
What happens:
i = 1
mean.vector<-c(mean.vector,mean(matrix[1,]))
mean.vector
2
i = 2
mean.vector<-c(mean.vector,mean(matrix[2,]))
mean.vector
2 / 3
i = 3
mean.vector<-c(mean.vector,mean(matrix[3,]))
mean.vector
2 / 3 / 1
Etc.
Since you’re creating the space as you go, each pass through the for loop adds one element to the end of the variable, so you don’t need the explicit indexing by [i].
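A matching sketch of the start-with-NULL approach, again with a made-up my.matrix. The extra variable lengths.seen is not part of the technique; it is only there to record how long mean.vector is after each loop.

```r
# Hypothetical 10 x 4 data matrix; the name and the values are illustration only.
set.seed(1)
my.matrix <- matrix(rnorm(40), nrow = 10, ncol = 4)

# Start with just the variable name, no space.
mean.vector <- NULL
lengths.seen <- NULL   # bookkeeping for this illustration only

for (i in 1:10) {
  # c() sticks the new mean onto the end of what is already there.
  mean.vector <- c(mean.vector, mean(my.matrix[i, ]))
  lengths.seen <- c(lengths.seen, length(mean.vector))
}

# One entry per loop: the length after loop 1, after loop 2, and so on.
lengths.seen
```

Here the length changes on every pass, which is why no [i] is needed on the left-hand side.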
Common Mistakes
Just a few common mistakes that have occurred:
1) Nothing indexed in the for loop:
new.matrix<-matrix(0,n,p)
for(i in 1:n){
new.matrix<-matrix(rnorm(p,mean.vector,1),n,p)
}
This creates a new matrix every time you go through the for loop instead of creating one row at a time. The last matrix you create will be the one that gets returned.
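To see the mistake in action, here is a small sketch with made-up values for n, p, and mean.vector (none of these numbers come from the assignment). Because rnorm is only asked for p numbers, matrix() has to repeat those same p numbers to fill all n*p cells.

```r
# Hypothetical sizes and means, for illustration only.
set.seed(2)
n <- 5
p <- 3
mean.vector <- c(0, 10, 20, 30, 40)

new.matrix <- matrix(0, n, p)
for (i in 1:n) {
  # Nothing is indexed by i, so the whole matrix is overwritten on every pass.
  new.matrix <- matrix(rnorm(p, mean.vector, 1), n, p)
}

# Count the distinct values in the final matrix.
# Only p numbers were drawn on the last pass, so only p distinct values appear.
length(unique(as.vector(new.matrix)))
```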
You have a vector of means, one for each row of the matrix.
Because the changing characteristic is on the rows, you must index on the rows.
new.matrix<-matrix(0,n,p)
for(i in 1:n){
new.matrix[i,]<-rnorm(p,mean.vector[i],1)
}
Here we have created one row’s worth of data at a time and plugged it into a row of the already created matrix. We could also use rbind:
new.matrix<-NULL
for(i in 1:n){
new.matrix<-rbind(new.matrix, rnorm(p,mean.vector[i],1))
}
Here there was no space created ahead of time; just the variable name.
So we’re adding one row at a time to the new.matrix. In the first loop, the new.matrix has one row; in the second loop, the new.matrix has two rows; etc.
In the above example, we don’t have an [i] (or indexing) but we are adding a row in every loop. So the indexing is subtle – the ith row belongs to the ith loop.
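The two correct versions can be run side by side. This is a sketch with made-up values for n, p, and mean.vector; the names filled and grown are hypothetical. Resetting the seed before each loop makes both loops draw the same random numbers, so the results should agree.

```r
# Hypothetical sizes and means, for illustration only.
n <- 4
p <- 3
mean.vector <- c(0, 10, 20, 30)

# Version 1: create the space, then fill row i.
set.seed(3)
filled <- matrix(0, n, p)
for (i in 1:n) {
  filled[i, ] <- rnorm(p, mean.vector[i], 1)
}

# Version 2: start with NULL, then bind on one row per loop.
set.seed(3)
grown <- NULL
for (i in 1:n) {
  grown <- rbind(grown, rnorm(p, mean.vector[i], 1))
}

# Both end up n x p, with the same numbers in the same places.
dim(filled)
dim(grown)
all.equal(as.vector(filled), as.vector(grown))
```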
2) A quick note on rbind and cbind:
You only use rbind and cbind when you are binding two things together.
rbind(new.matrix, rnorm(10,0,1)) needed; binds a new row onto new.matrix
rbind(rnorm(10,0,1)) not needed; you aren’t putting two things together.
Same thing with c:
You only need c( ) if you are putting numbers together that aren’t already in a vector.
c(2,3,1,5,6) needed
c(rnorm(10,0,1)) not needed; rnorm() puts things in a vector
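A quick check of both points, using a hypothetical vector x. Wrapping an existing vector in c() gives back the same vector, and rbind() with only one thing in it has nothing to bind; all it does is turn the vector into a one-row matrix.

```r
set.seed(4)
x <- rnorm(10, 0, 1)   # rnorm() already returns a vector

# c() around an existing vector changes nothing.
same.as.x <- identical(c(x), x)
same.as.x

# rbind() with a single vector does not bind anything;
# it only reshapes x into a matrix with one row.
one.row <- rbind(x)
dim(one.row)

# c() is needed when the numbers are not yet in a vector.
y <- c(2, 3, 1, 5, 6)
length(y)
```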
3) Losing track of rows vs. columns vs. elements:
When working with a matrix: the matrix is indexed by matrix[row#, col#].
matrix[1,] – the first row
matrix[,1] – the first column
matrix[1] – the first element.
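A small worked example of the three kinds of indexing, using a hypothetical 2 x 3 matrix called m. R fills and counts the elements of a matrix down the columns, which is what a single index like m[4] relies on.

```r
# matrix(1:6, 2, 3) fills down the columns:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
m <- matrix(1:6, nrow = 2, ncol = 3)

m[1, ]   # first row:     1 3 5
m[, 1]   # first column:  1 2
m[1]     # first element: 1
m[4]     # fourth element, counting down the columns: 4 (row 2, column 2)
```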
Make sure that what you’re plugging in matches the space of where you’re sending it.
Let’s say we have a matrix with n rows and p columns:
matrix[i]<-rnorm(n, mean[i],sd[i])
(Bad: plugging in an n-long vector into the space for one element)
matrix[i,] <- rnorm(n, mean[i], sd[i])
(Bad: plugging in an n-long vector into a row with space for p elements)
matrix[,i] <-rnorm(n, mean[i], sd[i])
(Good: plugging in an n-long vector into a column with space for n elements)
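The good case and one of the bad cases can be sketched as follows. The names my.matrix, means, sds, and result, and all of the numbers, are made up for illustration. The tryCatch() call is only there so the script keeps running when the mismatched assignment fails.

```r
# Hypothetical sizes, means, and sds (one mean and sd per column).
n <- 5
p <- 3
means <- c(0, 10, 20)
sds <- c(1, 1, 1)

set.seed(5)
my.matrix <- matrix(0, n, p)

# Good: each column has space for n elements, and rnorm() returns n numbers.
for (i in 1:p) {
  my.matrix[, i] <- rnorm(n, means[i], sds[i])
}
dim(my.matrix)

# Bad: a row has space for p = 3 elements, but rnorm() returns n = 5 numbers.
# R refuses the assignment; tryCatch() records that instead of stopping.
result <- tryCatch({
  my.matrix[1, ] <- rnorm(n, means[1], sds[1])
  "no error"
}, error = function(e) "error")
result
```

A quick habit that helps: before plugging something in, compare length() of what you are plugging in against the number of rows or columns from dim().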