Data types and coercion

Like most programming languages, R has a small set of basic data types:

  • integer, e.g. 10L (a plain 10 is stored as a double)
  • character (string), e.g. 'Hello'
  • complex, e.g. 3 + 4i
  • double, e.g. 18.323342
  • logical (boolean), e.g. TRUE
  • factor (a class for categorical data), e.g. red, green, blue, stored internally as integer codes such as 1, 2, 3
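
As a rough illustration of the types listed above, the sketch below builds one value of each kind and asks R what it is. The literals and the names values, types and classes are assumptions chosen for this example; typeof() reports the underlying storage type and class() the object class.

```r
# One example value per type; the literals are illustrative assumptions
values <- list(
  integer   = 10L,        # the L suffix makes an integer; a plain 10 is a double
  character = "Hello",
  complex   = 3 + 4i,
  double    = 18.323342,
  logical   = TRUE,
  factor    = factor(c("red", "green", "blue"))
)

# typeof() gives the storage type, class() the object class
types   <- sapply(values, typeof)
classes <- sapply(values, class)

print(types)
print(classes)
```

Note that the factor reports a storage type of integer, and the double reports a class of numeric.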

It is important to know what type your data is. In RStudio this is straightforward, because the Environment pane shows the object type for most variables.

From the command line you can also interrogate your variables and objects. Some useful commands are shown below.

First, we can find the class of an object or variable with the class() function, passing the variable or object name as the argument.


Input:
x <- c(1,2,3,4)
class(x)

y <- c('red','green','blue','yellow')
class(y)

Output:
'numeric'

'character'
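
Alongside class(), base R also offers typeof() and the is.*() family of predicates. The following is a minimal sketch, reusing the numeric vector from the example above, of how these might be combined to check a variable before working with it.

```r
x <- c(1, 2, 3, 4)

class(x)         # "numeric"
typeof(x)        # "double"  (the underlying storage type)
is.numeric(x)    # TRUE
is.character(x)  # FALSE
```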

It is also useful to know the structure of an object such as a data frame or list. For this we can use the str() function, again passing the variable or object name as the argument.


Input:
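# The 25 odd numbers from 1 to 49
data <- seq(1, 50, 2)
# Fill a 5 x 5 matrix row by row; str() lists the values column by column
y <- matrix(data, 5, 5, byrow = TRUE)
str(y)

# Data frame with an ID column and two columns of random Poisson counts
# (the counts differ between runs)
data2 <- data.frame(ID = seq(1, 25, 1),
                    ScoreOne = rpois(25, 20),
                    ScoreTwo = rpois(25, 35))
str(data2)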

Output:
 num [1:5, 1:5] 1 11 21 31 41 3 13 23 33 43 ...

'data.frame':	25 obs. of  3 variables:
 $ ID      : num  1 2 3 4 5 6 7 8 9 10 ...
 $ ScoreOne: int  24 15 31 20 20 27 30 16 23 26 ...
 $ ScoreTwo: int  35 35 36 44 43 37 30 34 33 31 ...
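
The same function works on lists. The sketch below uses a hypothetical list called record (its name and contents are assumptions made up for this example) to show str() applied to an object whose elements have different types.

```r
# Hypothetical list mixing several types
record <- list(
  id     = 1L,
  name   = "sample",
  scores = c(24, 15, 31),
  passed = TRUE
)

# Prints one line per element, with its type and a preview of its values
str(record)
```

Each element is shown with an abbreviated type, for example int, chr, num or logi.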

To make the best use of your data it is often necessary to coerce it to a different type or class. This is done with the following functions:

  • as.character()
  • as.integer()
  • as.double()
  • as.logical()
  • as.numeric() (equivalent to as.double())
  • as.factor()

Some examples of the use of these functions are given below.


Input:
x <- c(1,2,3,4)
class(x)

Output:
'numeric'

Input:
x <- as.character(x)
class(x)

Output:
'character'

Input:
y <- c('red','green','blue','yellow')
class(y)

Output:
'character'

Input:
y <- as.factor(y)
class(y)
str(y)

Output:
'factor'
 Factor w/ 4 levels "blue","green",..: 3 2 1 4

Input:
z <- c("TRUE","FALSE","TRUE","TRUE","FALSE","FALSE")
class(z)

Output:
'character'

Input:
z <- as.logical(z)
class(z)

Output:
'logical'
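
Coercion does not always behave as you might first expect. The sketch below, using made-up vectors named mixed, nums and f, illustrates three cases that are worth checking in your own data: text that cannot be parsed becomes NA, as.integer() truncates rather than rounds, and as.numeric() applied to a factor returns the internal level codes rather than the labels.

```r
# Text that cannot be parsed as a number becomes NA (R also raises a warning,
# suppressed here to keep the output clean)
mixed <- c("1", "2", "three")
nums <- suppressWarnings(as.numeric(mixed))
nums                          # 1 2 NA

# as.integer() drops the fractional part instead of rounding
as.integer(3.9)               # 3

# as.numeric() on a factor returns the level codes, not the labels
f <- factor(c("10", "20", "10"))
as.numeric(f)                 # 1 2 1
as.numeric(as.character(f))   # 10 20 10
```

Converting a factor through as.character() first is one common way to recover the original numeric values.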