Arrays are single- or multi-dimensional data structures. A one-dimensional array is also called a vector, and a two-dimensional array is a matrix.
Vectors
A vector, or one-dimensional array, can be created in a number of ways. For example, a numeric vector of successive integers can be created with the colon operator, as shown below.
Input:
a<-1:5
a
Output:
1 2 3 4 5
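The colon operator is only one of the ways mentioned above. As a minimal sketch, the base R functions seq(), rep() and numeric() also build numeric vectors; the variable names v1 to v4 are illustrative only.

```r
# Other base R ways to build a numeric vector (names v1..v4 are illustrative)
v1 <- seq(2, 10, by = 2)          # even numbers from 2 to 10
v2 <- seq(0, 1, length.out = 5)   # 5 equally spaced values between 0 and 1
v3 <- rep(3, times = 4)           # the value 3 repeated four times
v4 <- numeric(3)                  # a vector of three zeros

length(v1)                        # number of entries in v1
```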
A vector of strings can be created using the c() function, which combines its arguments into a single vector.
Input:
b<-c("Fish","Cat","Dog")
b
Output:
'Fish' 'Cat' 'Dog'
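c() is not limited to strings. The sketch below, with illustrative variable names, shows it joining existing vectors, and shows that mixing types in one call coerces every entry to a common type (here, character).

```r
# c() joins vectors as well as single values
nums <- c(1:3, 10)             # four entries: 1, 2, 3 and 10

# Mixing types coerces everything to the most general type
mixed <- c(1, "Cat", TRUE)
class(mixed)                   # "character"
mixed                          # every entry is now a string
```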
A vector can be extended by assigning to an index one past its current end.
Input:
a[6]<-6
a
Output:
1 2 3 4 5 6
Existing entries can also be changed by assigning to their index.
Input:
a[4]<-11
a
Output:
1 2 3 11 5 6
This is covered in more detail in the 'Subscripting and subsetting' section.
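Index assignment is not the only way to grow a vector. As a rough sketch, c() and append() do the same job, and assigning well past the end pads the gap with NA. The vector v is a stand-in used only for this illustration.

```r
v <- 1:5

v <- c(v, 6)                   # add an entry at the end
v <- append(v, 0, after = 0)   # insert 0 before the first entry
length(v)                      # 7

# Assigning beyond the end fills the skipped positions with NA
v[10] <- 10
v
```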
Arrays
When creating multi-dimensional arrays, we use the array() function and must specify the dimensions through its dim argument, for example dim=c(rows, columns) for a two-dimensional array. Note the order in which the array fills: column by column.
Input:
x<-array(1:20,dim=c(4,5))
x
Output:
1 5 9 13 17
2 6 10 14 18
3 7 11 15 19
4 8 12 16 20
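To make the fill order concrete, the sketch below checks where the value 2 lands. It also shows matrix() with byrow = TRUE, a base R option for filling row by row instead; the name y is illustrative.

```r
# array() fills column by column
x <- array(1:20, dim = c(4, 5))
dim(x)        # 4 rows, 5 columns
x[2, 1]       # 2: the second value goes to row 2 of column 1

# matrix() can fill row by row when asked to
y <- matrix(1:20, nrow = 4, ncol = 5, byrow = TRUE)
y[1, 2]       # 2: the second value goes to column 2 of row 1
```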
We can extract single entries by specifying the row and column of the entry.
Input:
x[1,5]
Output:
17
Or a whole column, or a whole row, by leaving the other index blank.
Input:
x[,1]
x[4,]
Output:
1 2 3 4
4 8 12 16 20
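Note that an extracted row or column comes back as a plain vector. The following sketch shows the drop = FALSE argument, which keeps the matrix shape, and the selection of several rows and columns at once; the variable names are illustrative.

```r
x <- array(1:20, dim = c(4, 5))

col1 <- x[, 1]                    # plain vector, dimensions are dropped
dim(col1)                         # NULL

col1_mat <- x[, 1, drop = FALSE]  # stays a 4 x 1 matrix
dim(col1_mat)                     # 4 1

# Rows 2 to 3 of columns 1 and 5, returned as a 2 x 2 matrix
sub <- x[2:3, c(1, 5)]
sub
```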
In the example below, we create a 3-dimensional array, overwrite a single entry, a column and a row of its first matrix, and then print each matrix in full.
Input:
# Creates a 4 x 5 matrix populated with 7's
a <- matrix(7, 4, 5)
a
# Creates a 4 x 5 matrix populated with 9's
b <- matrix(9, 4, 5)
b
# Creates a 4 x 5 x 2 array with two levels,
# one containing 7's and the other containing 9's
# named arr rather than c, so the base function c() is not masked
arr<-array(c(a, b), c(4, 5,2))
arr[1,4,1]<-4 # the entry of row 1, column 4, matrix 1
arr[,3,1]<-1 # the entire column 3 of matrix 1
arr[2,,1]<-2 # row 2 of matrix 1
arr[,,1] # print the whole matrix 1
arr[,,2] # print the whole matrix 2
Output:
7 7 7 7 7
7 7 7 7 7
7 7 7 7 7
7 7 7 7 7
9 9 9 9 9
9 9 9 9 9
9 9 9 9 9
9 9 9 9 9
7 7 1 4 7
2 2 2 2 2
7 7 1 7 7
7 7 1 7 7
9 9 9 9 9
9 9 9 9 9
9 9 9 9 9
9 9 9 9 9
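The same three-index notation reads values as well as writing them. A minimal sketch, rebuilding the array under illustrative names (layer7, layer9, arr3) so that it runs on its own:

```r
layer7 <- matrix(7, 4, 5)
layer9 <- matrix(9, 4, 5)
arr3 <- array(c(layer7, layer9), dim = c(4, 5, 2))

dim(arr3)          # 4 5 2: rows, columns, matrices
arr3[1, 4, 2]      # single entry: row 1, column 4 of matrix 2
arr3[3, , 1]       # row 3 of matrix 1, returned as a vector
dim(arr3[, , 2])   # matrix 2 on its own is an ordinary 4 x 5 matrix
```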
Dataframes
A dataframe is an R object that can hold data of different types, and it is accessed in a different way to vectors and arrays, which can make analysis tasks easier. Because R is not a spreadsheet, it doesn't automatically treat data as a table. To create a dataframe, use the data.frame() function, or convert an existing object with as.data.frame() as in the example below. The object from which a dataframe is created can be a vector, matrix or array, or data read from a source file.
Input:
data<-as.data.frame(matrix(1:4,nrow=2))
data
Output:
V1 V2
1 3
2 4
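To illustrate columns of different types, here is a small sketch that builds a dataframe directly with data.frame(). The dataframe pets and its column names are made up for this example; stringsAsFactors = FALSE is set explicitly so that the text column stays character in every R version.

```r
# Each argument becomes a column; columns may differ in type
pets <- data.frame(
  name   = c("Fish", "Cat", "Dog"),
  legs   = c(0, 4, 4),
  indoor = c(TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

str(pets)            # compact summary of the columns and their types
sapply(pets, class)  # the class of each column
```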
R names columns in the format 'V1', 'V2' etc. by default. We can add our own column names to a dataframe as shown below. We create a vector of column names and assign it as the dataframe's column names using the function colnames(). Notice that the function is on the left-hand side of the assignment operator (<-): this means that the contents of the object names are being assigned to the column names of the object data via the colnames() function.
Input:
a<-matrix(1:6,nrow =2)
names<-c("ID","height","weight")
data<-as.data.frame(a)
colnames(data)<-names
data
Output:
ID height weight
1 3 5
2 4 6
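Because colnames() returns an ordinary character vector, a single name can be replaced by indexing into it. A short sketch; the new label "height_cm" is a hypothetical name chosen for illustration.

```r
data <- as.data.frame(matrix(1:6, nrow = 2))
colnames(data) <- c("ID", "height", "weight")

# Rename only the second column
colnames(data)[2] <- "height_cm"
colnames(data)
```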
We can also convert a dataframe to a matrix using the as.matrix() function.
Input:
matData <- as.matrix(data)
matData
Output:
ID height weight
[1,] 1 3 5
[2,] 2 4 6
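One point to watch: a matrix holds a single type, so the result depends on the columns being converted. The sketch below, using a made-up dataframe mixed, suggests that an all-numeric dataframe gives a numeric matrix, while a dataframe with a text column gives a character matrix.

```r
# All-numeric dataframe: the matrix stays numeric
numeric_df <- as.data.frame(matrix(1:6, nrow = 2))
is.numeric(as.matrix(numeric_df))    # TRUE

# Mixed dataframe: every entry is converted to a string
mixed <- data.frame(ID = 1:2, pet = c("Cat", "Dog"),
                    stringsAsFactors = FALSE)
m <- as.matrix(mixed)
is.character(m)                      # TRUE
dim(m)                               # 2 2
```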
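When (part of) a single column is extracted from a dataframe, for example with data$height or data[, 2], it automatically becomes a vector. A row extracted with data[1, ] remains a one-row dataframe, because its entries may be of different types.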
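A quick way to check what an extraction returns is to test its class. In this sketch, which rebuilds the dataframe from the earlier example, a column comes back as a vector, while a row selected with [i, ] keeps its dataframe class unless it is flattened with unlist().

```r
data <- as.data.frame(matrix(1:6, nrow = 2))
colnames(data) <- c("ID", "height", "weight")

h <- data$height                 # a column, returned as a vector
is.vector(h)                     # TRUE

r <- data[1, ]                   # a row, still a dataframe
is.data.frame(r)                 # TRUE
row_vec <- unlist(r)             # flatten the row to a named vector

# drop = FALSE keeps a single column as a dataframe
is.data.frame(data[, "height", drop = FALSE])   # TRUE
```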