Data structures and conversion

Arrays are single- or multi-dimensional data structures. A one-dimensional array is also called a vector, and a two-dimensional array is a matrix.

Vectors

A vector, or one-dimensional array, can be created in a number of ways. For example, a numerical vector of successive integers can be created with the colon operator, as shown below.


Input:
a<-1:5
a

Output:
1 2 3 4 5
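
The colon operator is only one of the ways to build a vector. As a minimal sketch, the base R functions seq() and rep() cover other common cases; the variable names (evens, zeros, halves) are made up for this illustration.

```r
# seq() builds a regular sequence with a chosen step
evens <- seq(2, 10, by = 2)          # 2 4 6 8 10
# rep() repeats a value a given number of times
zeros <- rep(0, times = 3)           # 0 0 0
# length.out asks for a fixed number of evenly spaced values
halves <- seq(0, 1, length.out = 3)  # 0.0 0.5 1.0
evens
zeros
halves
```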

A vector of strings can be created using the c() function, which combines its arguments into a single vector.


Input:
b<-c("Fish","Cat","Dog")
b

Output:
'Fish' 'Cat' 'Dog'
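
All elements of a vector share one type, so c() converts mixed input to a common type. The short sketch below (with a made-up variable name, mixed) shows a number and a logical value being turned into strings when combined with a string.

```r
# Mixing a number, a string and a logical: everything becomes character
mixed <- c(1, "Cat", TRUE)
mixed          # "1" "Cat" "TRUE"
class(mixed)   # "character"
```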

A vector can be extended by assigning to an index just past its current end.


Input:
a[6]<-6
a

Output:
1 2 3 4 5 6

Or existing entries can be changed by assigning to their index.


Input:
a[4]<-11
a

Output:
1 2 3 11 5 6
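
If the assigned index skips past the end of the vector, R fills the gap with NA (missing values). A small sketch, using a fresh vector v so that the example above is not disturbed:

```r
v <- 1:3
v[6] <- 6   # positions 4 and 5 do not exist yet
v           # 1 2 3 NA NA 6
length(v)   # 6
```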

This is covered in more detail in the 'Subscripting and subsetting' section.

Arrays

When creating multi-dimensional arrays, we use the array() function, and must specify the dimensions. The syntax is dim=c(rows, columns). Note the order in which the array fills: down each column in turn, not across the rows.


Input:
x<-array(1:20,dim=c(4,5)) 
x

Output:
1	5	9	13	17
2	6	10	14	18
3	7	11	15	19
4	8	12	16	20
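
To make the fill order concrete, the sketch below compares the default column-wise fill with the byrow = TRUE argument of matrix(), which fills across the rows instead. The names by_col and by_row are made up for this illustration.

```r
# array() and matrix() fill column by column by default
by_col <- matrix(1:6, nrow = 2)
# byrow = TRUE fills row by row instead
by_row <- matrix(1:6, nrow = 2, byrow = TRUE)
by_col[1, ]   # 1 3 5
by_row[1, ]   # 1 2 3
dim(by_col)   # 2 3
```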

We can extract single entries by specifying the row and column of the entry.


Input:
x[1,5]

Output:
17

Or a whole column or row, by leaving the other index empty.


Input:
x[,1]
x[4,]

Output:
1 2 3 4
4 8 12 16 20
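
Note that an extracted row or column is returned as a plain vector. If the matrix shape should be kept, the drop = FALSE argument can be added, as in this small sketch (col_vec and col_mat are made-up names):

```r
x <- array(1:20, dim = c(4, 5))
col_vec <- x[, 1]                 # plain vector of length 4
col_mat <- x[, 1, drop = FALSE]   # 4 x 1 matrix
is.null(dim(col_vec))   # TRUE
dim(col_mat)            # 4 1
```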

In the example below, we create a 3-dimensional array and then assign to a single entry, a column and a row of it, before printing each matrix in full.


Input:
# Creates a 4 x 5 matrix populated with 7's
a <- matrix(7, 4, 5)
a
# Creates a 4 x 5 matrix populated with 9's
b <- matrix(9, 4, 5)
b
# Creates a 4 x 5 x 2 array with two levels,
# one containing 7's and the other containing 9's
c <- array(c(a, b), dim = c(4, 5, 2))

c[1,4,1] <- 4 # the entry in row 1, column 4 of matrix 1
c[,3,1] <- 1  # the entire column 3 of matrix 1
c[2,,1] <- 2  # the entire row 2 of matrix 1
c[,,1] # print the whole of matrix 1
c[,,2] # print the whole of matrix 2

Output:
7	7	7	7	7
7	7	7	7	7
7	7	7	7	7
7	7	7	7	7

9	9	9	9	9
9	9	9	9	9
9	9	9	9	9
9	9	9	9	9

7	7	1	4	7
2	2	2	2	2
7	7	1	7	7
7	7	1	7	7

9	9	9	9	9
9	9	9	9	9
9	9	9	9	9
9	9	9	9	9
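
As a further sketch, dim() reports the size of each dimension, and apply() can summarise over one of them. The array is rebuilt here under the made-up name arr so that the snippet runs on its own.

```r
a <- matrix(7, 4, 5)
b <- matrix(9, 4, 5)
arr <- array(c(a, b), dim = c(4, 5, 2))
dim(arr)             # 4 5 2
# Sum each 4 x 5 matrix (margin 3 is the third dimension)
apply(arr, 3, sum)   # 140 180
```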

Dataframes

A dataframe is an R object that can hold columns of different types and lets the data be accessed in a different way to vectors and arrays, which can make analysis tasks easier. Because R is not a spreadsheet, it doesn't automatically treat data as a table of rows and columns. To create a dataframe, use the data.frame() function, or use as.data.frame() to convert an existing object. The object from which a dataframe is created can be a vector, matrix or array, or a source file.


Input:
data<-as.data.frame(matrix(1:4,nrow=2))
data

Output:
V1	V2
1	3
2	4

When a matrix without column names is converted, R names the columns 'V1', 'V2' etc. by default. We can add our own column names to a dataframe as shown below. We create a vector of column names and assign it as the dataframe's column names using the function colnames(). Notice that the function is on the left-hand side of the assignment operator (<-). This means that the contents of the object names replace the column names of the object data.


Input:
a<-matrix(1:6,nrow =2)
names<-c("ID","height","weight")
data<-as.data.frame(a)
colnames(data)<-names
data

Output:
ID	height	weight
1	3	5
2	4	6
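
The same replacement form can be combined with indexing to rename a single column. In this sketch the new name "mass" is made up for illustration, and the dataframe is rebuilt so the snippet runs on its own.

```r
data <- as.data.frame(matrix(1:6, nrow = 2))
colnames(data) <- c("ID", "height", "weight")
# Rename only the third column, by position
colnames(data)[3] <- "mass"
colnames(data)   # "ID" "height" "mass"
```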

We can also convert a dataframe to a matrix using the as.matrix() function.


Input:
matData <- as.matrix(data)
matData

Output:
     ID height weight
[1,]  1      3      5
[2,]  2      4      6
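
A matrix can hold only one type of data, so the conversion behaves differently when the dataframe mixes types. As a sketch with a made-up dataframe (mixed), a dataframe with a character column becomes a character matrix:

```r
mixed <- data.frame(ID = 1:2, name = c("Cat", "Dog"),
                    stringsAsFactors = FALSE)
m <- as.matrix(mixed)
is.character(m)   # TRUE: the ID numbers are now strings too
dim(m)            # 2 2
```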

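When (part of) a single column is extracted from a dataframe, for example with data[,2] or data$height, it automatically becomes a vector. A row that spans several columns, such as data[1,], is returned as a one-row dataframe instead.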
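
The sketch below illustrates column extraction on the small dataframe used above, rebuilt here so the snippet runs on its own. The drop = FALSE argument is one way to keep the result as a dataframe; the names col, part and kept are made up for this illustration.

```r
data <- as.data.frame(matrix(1:6, nrow = 2))
colnames(data) <- c("ID", "height", "weight")
col  <- data[, "height"]                 # whole column: vector 3 4
part <- data[2, "height"]                # part of a column: 4
kept <- data[, "height", drop = FALSE]   # stays a one-column dataframe
is.vector(col)        # TRUE
is.data.frame(kept)   # TRUE
```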