Chapter 2 Value and Object Types
In the previous chapter, we focused on how to use R for running code, assigning variables, creating vectors, and producing plots. In this chapter, we shift our attention to what R is actually working with: the types of values R understands and the types of objects it uses to store and manipulate data.
Understanding value types and object types is essential for writing correct, efficient, and readable R code. Many errors encountered by beginners come not from incorrect syntax, but from a misunderstanding of how R is interpreting the data provided.
2.1 Value types
A value type refers to the fundamental nature of a single piece of information stored in R. R has several basic value types, but we will focus on the most commonly used:
- Double (numeric)
- Integer
- Character
- Logical
- Date/Time
You can always inspect the value type of an object using the function str().
2.1.1 Double (numeric) values
Double values (often simply referred to as numeric values) represent real numbers and are the default numeric type in R:
## num 3.14
## num 1.41
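As a hedged sketch, output of this form can be obtained with str(). The variable names x and y, and the use of sqrt(2) for the second value, are assumptions made purely for illustration:

```r
# Assumed example values; the names x and y are illustrative only
x <- 3.14
y <- sqrt(2)

str(x)      # compact display of the type and value
str(y)      # str() abbreviates the value to a few significant digits
typeof(x)   # "double": the underlying storage type
```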
2.1.2 Integer values
Integers represent whole numbers (i.e. numbers without decimal places):
## num 5
Although this may appear to be an integer, R actually treats numeric values as double precision numbers by default. To explicitly create an integer value, you can use as.integer() (or append the suffix L to the number, e.g. 5L):
## int 5
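A minimal sketch of the difference between the two cases; the variable names a, b and d are hypothetical:

```r
a <- 5               # a plain number is stored as a double
b <- as.integer(5)   # explicit conversion to an integer
d <- 5L              # the L suffix also creates an integer

str(a)
str(b)
is.integer(a)        # FALSE
is.integer(d)        # TRUE
```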
Integers are commonly used when working with indices, loops, or counts (see later chapters).
2.1.3 Character values
We have already discussed these when we introduced character strings:
## chr "10"
## num 10
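One way to see the difference between the character string "10" and the number 10 is sketched below. The variable name z and the use of as.numeric() are assumptions for illustration:

```r
z <- "10"              # the quotes make this a character string
str(z)

z_num <- as.numeric(z) # convert the string to a number
str(z_num)

# z + 1 would stop with an error, because arithmetic is not
# defined for character values; z_num + 1 works as expected
z_num + 1
```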
Character values are especially important when working with labels, names, categories, or textual data.
2.1.4 Logical (Boolean) values
Logical or Boolean values represent truth values and can take one of two forms:
- TRUE
- FALSE
Logical values often arise in R from comparisons and play a central role in filtering data and controlling the flow of code, which will be explored in later chapters:
## [1] TRUE
## [1] FALSE
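The comparisons below are assumed examples of the kind that return logical values; the name is_big is hypothetical:

```r
5 > 3                # TRUE
2 == 3               # FALSE

is_big <- 10 > 100   # the result of a comparison can be stored
str(is_big)
class(is_big)        # "logical"
```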
2.1.5 Date and date-time values
Working with dates and times is extremely common when dealing with real-world data and R has specialised value types for working with both. The most common are:
- Date for calendar dates
- POSIXct/POSIXlt for date-time objects
Dates are typically created using the as.Date() function and Date-Times using as.POSIXct():
## [1] "2026-01-01"
## Date[1:1], format: "2026-01-01"
## [1] "2026-01-01 12:30:00 GMT"
## POSIXct[1:1], format: "2026-01-01 12:30:00"
It is also possible to create and assign the current time (note that the date and time below will display the date these notes were rendered):
## [1] "2026-02-02 12:30:43 GMT"
## POSIXct[1:1], format: "2026-02-02 12:30:43"
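A sketch of calls that would give output of the form shown above. The variable names d, dt and now are hypothetical, and the GMT time zone is assumed from the printed output:

```r
d  <- as.Date("2026-01-01")                          # calendar date
dt <- as.POSIXct("2026-01-01 12:30:00", tz = "GMT")  # date-time
now <- Sys.time()                                    # current date-time

d
str(d)
dt
str(dt)
class(now)   # Sys.time() returns a POSIXct object
```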
2.2 Object types
An object type refers to the structure used by R to store one or more values. Objects can contain single values or many values arranged in different ways. The most important object types we will encounter are:
- Numeric and Character objects
- Vectors
- Factors
- Matrices
- Data Frames
- Lists
2.2.1 Numeric and character objects
The simplest objects in R are stored as a single value. Even though these appear simple, R still treats these as vectors but of length 1:
## [1] 1
## [1] 1
## num 10
## chr "Hello"
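This can be checked with length(), as in the following sketch; the names n and s are assumptions:

```r
n <- 10
s <- "Hello"

length(n)      # 1
length(s)      # 1
str(n)
str(s)
is.vector(n)   # TRUE: a single value is a vector of length 1
```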
This design choice allows R to apply so-called vectorised calculations/operations consistently (as we saw in the previous chapter).
2.2.2 Vectors
We have encountered vectors and seen different methods of creating them. However, one important point that we did not make earlier is that, by definition, vectors MUST contain values of the same type! Attempting to mix value types within a vector will cause coercion: R will automatically convert them all to a single type (usually the most flexible):
## chr [1:3] "1" "A" "TRUE"
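A short sketch of coercion in action. The vector name mixed and the two additional examples are illustrative assumptions:

```r
mixed <- c(1, "A", TRUE)   # number, character and logical together
str(mixed)                 # everything has become a character string

num_lgl <- c(1, TRUE)      # logical is coerced to numeric (TRUE becomes 1)
int_dbl <- c(1L, 2.5)      # integer is coerced to double
str(num_lgl)
str(int_dbl)
```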
2.2.3 Factors
Factors are used to represent categorical data. Although they may look like character vectors, they are stored internally as integers with associated labels. This distinction is important when it comes to using factors as variables in statistical models:
## Factor w/ 2 levels "Female","Male": 2 1 1 2
To inspect the different categories (levels) of the factor variable(s), we can use the levels() function:
## [1] "Female" "Male"
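A sketch of how a factor like the one above could be created. The variable name sex and the input values are assumptions inferred from the printed output:

```r
sex <- factor(c("Male", "Female", "Female", "Male"))

str(sex)
levels(sex)       # levels are sorted alphabetically by default
as.integer(sex)   # the underlying integer codes: 2 1 1 2
```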
2.2.4 Matrices
We have already discussed how R can combine a series of values as vectors which can be used in calculations or other operations. R can just as easily deal with matrices and, use these to perform matrix calculations, and even solve problems from linear algebra.
As with vectors, there are actually a number of different ways to create matrices in R, but let us begin by looking at the matrix() function and using the query command, i.e., ?matrix() (alternatively via the help tab) for more information. Doing so shows that the general form of the matrix() function is given by
matrix(data = , nrow = , ncol = , byrow = , dimnames = )
where each of the arguments is defined as follows:
- data - This is a vector of data that is used to create the elements of the matrix itself
- nrow - This specifies the number of rows desired for the matrix
- ncol - This specifies the number of columns desired for the matrix.
- byrow - This argument instructs R on how to fill the matrix using the data vector. If it takes the value TRUE, the elements are filled row-wise, i.e. R first fills the whole of the first row, then moves down to the second row, and so on. If FALSE (the default), the elements are filled column-wise instead.
- dimnames - This argument allows you to assign names to the rows and columns of the matrix.
Note that if either nrow or ncol is not given, then R will infer the missing value from the length of the data vector. If the data vector is too short to fill the matrix, R will fill the remaining elements by repeating the original data vector until the matrix is full.
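The effect of byrow, and of leaving out ncol, can be seen in the following sketch; the object names m_col, m_row and m_rep are hypothetical:

```r
m_col <- matrix(1:6, nrow = 2)                # ncol is inferred; filled column by column
m_row <- matrix(1:6, nrow = 2, byrow = TRUE)  # same data, filled row by row
m_rep <- matrix(1:2, nrow = 2, ncol = 3)      # short data vector is repeated

m_col
m_row
m_rep
```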
Example 2.1 Consider the following two matrices
\[\begin{equation*} A = \left( \begin{array}{cc} 3 & 6 \\ 4 & 2 \end{array} \right) \quad \text{and} \quad B = \left( \begin{array}{cc} 1 & 4 \\ 5 & 6 \end{array} \right). \end{equation*}\]
These can be created in R using the matrix() function as follows:
## [,1] [,2]
## [1,] 3 6
## [2,] 4 2
## [,1] [,2]
## [1,] 1 4
## [2,] 5 6
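A sketch of calls that yield the printed matrices, assuming the default column-wise filling (byrow = FALSE). With byrow = TRUE the same data vectors would instead fill each matrix row by row:

```r
# Data vectors are read down the columns: first column, then second column
A <- matrix(c(3, 4, 6, 2), nrow = 2, ncol = 2)
B <- matrix(c(1, 5, 4, 6), nrow = 2, ncol = 2)

A
B
```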
It is important to note that like vectors, the elements/values of a matrix must all be of the same type! This is typically why matrices are not a good choice of object when storing data, and data frames (see below) are preferred.
2.2.5 Matrix calculations
Now that you have created your matrices and assigned them as variables \(A\) and \(B\), you can use them in calculations:
## [,1] [,2]
## [1,] 4 10
## [2,] 9 8
## [,1] [,2]
## [1,] -2 -2
## [2,] 1 4
## [,1] [,2]
## [1,] 3 24
## [2,] 20 12
## [,1] [,2]
## [1,] 3.0 1.5000000
## [2,] 0.8 0.3333333
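The four results above are consistent with the element-wise operations sketched below. This assumes A and B hold the values printed in Example 2.1; note that the second result corresponds to B - A rather than A - B:

```r
A <- matrix(c(3, 4, 6, 2), nrow = 2)
B <- matrix(c(1, 5, 4, 6), nrow = 2)

A + B   # element-wise sum
B - A   # element-wise difference
A * B   # element-wise product
A / B   # element-wise division
```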
**BE CAREFUL!** Notice that all the calculations have been done element-wise. As with the vectors, this turns out to be a very helpful tool within R although it might not appear so just now.
If you want to apply matrix-multiplication you have to use a slightly different command:
## [,1] [,2]
## [1,] 33 48
## [2,] 14 28
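Matrix multiplication uses the %*% operator. A minimal sketch, again assuming A and B hold the values printed in Example 2.1:

```r
A <- matrix(c(3, 4, 6, 2), nrow = 2)
B <- matrix(c(1, 5, 4, 6), nrow = 2)

AB <- A %*% B   # matrix product
BA <- B %*% A   # order matters: this is generally a different matrix

AB
BA
```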
2.2.6 Matrix operations
There is, of course, an array of other calculations you may apply when working with matrices, e.g. determinant, inverse, transpose etc. Rather than showing each of these in turn, in this section we simply provide a table of the different matrix/vector operations that can be used in R, with a brief description of what each of them is used for. We suggest that you try these out for yourself in order to familiarise yourself with them and understand how they work, and don’t forget to use the ‘Help’ function if you’re unsure. Once you have mastered these operations, have a go at the exercises in Section 2.3.
In the following table, A and B denote matrices, whilst x and b denote vectors:
| Operation | Description |
|---|---|
| A + B | Element-wise sum |
| A - B | Element-wise subtraction |
| A * B | Element-wise multiplication |
| A %*% B | Matrix multiplication |
| t(A) | Transpose |
| diag(x) | Creates a diagonal matrix with the elements of x on the main diagonal |
| diag(A) | Returns a vector containing the elements of the main diagonal of A |
| diag(k) | If k is a scalar, this creates a \(k \times k\) identity matrix |
| solve(A) | Inverse of A, where A is a square matrix |
| solve(A, b) | Solves for vector x in the equation \(A\vec{x} = \vec{b}\) (i.e. \(\vec{x} = A^{-1}\vec{b}\)) |
| cbind(A, B, ...) | Combines matrices (or vectors) horizontally and returns a matrix |
| rbind(A, B, ...) | Combines matrices (or vectors) vertically and returns a matrix |
| rowMeans(A) | Returns a vector of the individual row means |
| rowSums(A) | Returns a vector of the individual row sums |
| colMeans(A) | Returns a vector of the individual column means |
| colSums(A) | Returns a vector of the individual column sums |
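A few of these operations are sketched below on a small example. The matrix A reuses the values printed in Example 2.1, while the vector b and the names A_inv and x are assumptions chosen for illustration:

```r
A <- matrix(c(3, 4, 6, 2), nrow = 2)
b <- c(9, 6)

t(A)               # transpose
diag(A)            # main diagonal of A: 3 2
diag(2)            # 2 x 2 identity matrix

A_inv <- solve(A)  # inverse of A
A %*% A_inv        # the identity matrix, up to rounding error

x <- solve(A, b)   # solves A x = b
A %*% x            # reproduces b

cbind(A, b)        # b appended as an extra column
rowSums(A)         # 9 6
colMeans(A)        # 3.5 4.0
```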
Recall that vectors are just particular cases of matrices with either one row or one column. Therefore, it should not be a surprise to find that you can create a vector using the matrix function. To do this, simply set the nrow or ncol argument equal to 1, depending on the format of vector you want (row or column vector).
The only slight restriction of simply using the c() function is that the resulting vector carries no row or column orientation of its own, and R treats it as a column vector when it is transposed. We point out here that this might not be so clear when you first define the vector in R, as the output given in the console looks like the form of a row vector. To obtain a row vector, you can simply take the transpose (see table above) of the original vector.
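The difference between a plain vector and one-row or one-column matrices can be inspected with dim(), as in this sketch; the names x, row_vec and col_vec are hypothetical:

```r
x <- c(1, 2, 3)

dim(x)                          # NULL: a plain vector stores no dimensions

row_vec <- t(x)                 # transpose gives a 1 x 3 matrix (row vector)
col_vec <- matrix(x, ncol = 1)  # explicit 3 x 1 matrix (column vector)

dim(row_vec)
dim(col_vec)
```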
2.2.7 Data frames
A data frame is the most commonly used object for storing data sets in R. It has a matrix/table-like structure, where
- Columns can be of different types
- Rows represent different observations
## id name score
## 1 1 A 65
## 2 2 B 70
## 3 3 C 82
## 4 4 D 90
## 'data.frame': 4 obs. of 3 variables:
## $ id : int 1 2 3 4
## $ name : chr "A" "B" "C" "D"
## $ score: num 65 70 82 90
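A sketch of how a data frame like the one above could be built with data.frame(). The object name df is an assumption, and stringsAsFactors = FALSE is written out so that name stays a character column in older versions of R as well:

```r
df <- data.frame(
  id    = 1:4,                   # integer column
  name  = c("A", "B", "C", "D"), # character column
  score = c(65, 70, 82, 90),     # numeric (double) column
  stringsAsFactors = FALSE
)

df
str(df)
df$score   # a single column is extracted with $
```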
Data frames form the backbone of most statistical analyses in R and we will work with them a lot more in the following chapters.
2.2.8 Lists
Finally, a list is the most flexible object type in R. Unlike vectors and matrices, lists can contain elements of different types and structures, and are therefore often used to store complex outputs from functions:
## $numbers
## [1] 1 2 3 4 5
##
## $text
## [1] "Hello"
##
## $matrix
## [,1] [,2]
## [1,] 3 6
## [2,] 4 2
##
## $data
## id name score
## 1 1 A 65
## 2 2 B 70
## 3 3 C 82
## 4 4 D 90
## List of 4
## $ numbers: int [1:5] 1 2 3 4 5
## $ text : chr "Hello"
## $ matrix : num [1:2, 1:2] 3 4 6 2
## $ data :'data.frame': 4 obs. of 3 variables:
## ..$ id : int [1:4] 1 2 3 4
## ..$ name : chr [1:4] "A" "B" "C" "D"
## ..$ score: num [1:4] 65 70 82 90
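A sketch of a list with the same four components as the output above. The object name my_list is hypothetical and the component values are taken from the printed output:

```r
my_list <- list(
  numbers = 1:5,
  text    = "Hello",
  matrix  = matrix(c(3, 4, 6, 2), nrow = 2),
  data    = data.frame(id = 1:4,
                       name = c("A", "B", "C", "D"),
                       score = c(65, 70, 82, 90),
                       stringsAsFactors = FALSE)
)

str(my_list)
my_list$text               # access a component by name
my_list[["numbers"]][2]    # second element of the numbers component
```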
2.2.9 Converting between object types
In some cases, you may wish to convert objects into different object types. This is especially useful when you want to use an object within a specified function and the function only allows (or expects) inputs to be of a specific type. To convert between objects, we can use one of many different as.*() type functions:
## [1] 10
## [1] "100"
## [1] Low Medium High
## Levels: High Low Medium
## id name score
## [1,] "1" "A" "65"
## [2,] "2" "B" "70"
## [3,] "3" "C" "82"
## [4,] "4" "D" "90"
## num [1:2, 1:2] 3 4 6 2
## 'data.frame': 2 obs. of 2 variables:
## $ V1: num 3 4
## $ V2: num 6 2
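The conversions sketched below give results of the form shown above. All object names and the input values are assumptions inferred from the printed output:

```r
num <- as.numeric("10")                       # character to numeric
chr <- as.character(100)                      # numeric to character
fac <- as.factor(c("Low", "Medium", "High"))  # character vector to factor

df <- data.frame(id = 1:4, name = c("A", "B", "C", "D"),
                 score = c(65, 70, 82, 90), stringsAsFactors = FALSE)
df_mat <- as.matrix(df)       # mixed columns, so every entry becomes character

A <- matrix(c(3, 4, 6, 2), nrow = 2)
A_df <- as.data.frame(A)      # columns receive the default names V1 and V2

num; chr; fac
df_mat
str(A)
str(A_df)
```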
2.3 Exercises
Exercise 2.1 Create the following objects in R:
An integer with value 12
A double with value 12
A character string containing your name
A logical value indicating whether \(5 > 3\)
A date corresponding to 1st January 2023
Use str() to inspect that each object is created correctly.
Exercise 2.2 Create a character vector representing the following grades from an exam:
A,B,B,C,A,C,B
Convert this vector into a factor.
Inspect the structure and levels of the factor.
Change the order of the levels of this factor such that \(C<B<A\). Hint: Use the ?factor() query to see how to set levels.
Exercise 2.3 Using R, create the following matrices and vectors: \[ A=\left[ \begin{array}{ccc} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 8 \end{array}\right] \qquad B=\left[ \begin{array}{ccc} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 8 \end{array}\right] \qquad D=\left[ \begin{array}{cc} 1 & 3 \\ 4 & 6 \\ 7 & 9 \end{array}\right] \qquad \vec{b}=\left[ \begin{array}{c} 3 \\ 6 \\ 9 \end{array}\right] \]
- Using the objects defined above, perform the following operations and check if the result is what you would expect:
\(A + B\) element-wise sum
\(A \times B\) element-wise multiplication
\(A \times B\) matrix multiplication
\(B \times D\) matrix multiplication
\(B \times \vec{b}\) matrix multiplication
Compute the transpose of matrix A and of matrix D.
Create a matrix with the elements 1, 2, 3, 4 in the main diagonal and zeros in the off diagonal elements.
Define a vector which consists of elements from the main diagonal of matrix B.
Build an identity matrix with dimension 10.
Compute the inverse of matrix A. Check if \(A \times A^{-1}=I_3\).
Find the solution \(\vec{x} = (x_1, x_2, x_3)^\top\), where \[ \left\{ \begin{array}{ccl} 6.5 &=& x_1 + x_2 + x_3 \\ 9 &=& 0.5 x_1 + 2 x_2 + x_3\\ 11 &=& 3 x_1 + x_2 + 2 x_3 \end{array}\right. \]
Combine the matrices A and D, horizontally.
Combine the matrix A and the transpose of vector b vertically.
Compute the mean for each row of matrix A. Do the same for each column of matrix A.
Compute the sum for each row of matrix B. Do the same for each column of matrix B.
Exercise 2.4 Create a data frame containing:
A numeric ID (1 to 7)
A character variable containing 7 names
A factor variable containing the 7 exam grades from Exercise 2.2
Inspect the structure of the data frame.
Exercise 2.5 Create a list containing:
A numeric vector
A character string with your name
The grade factors from the previous exercise(s)
A data frame from the previous exercise
Use str() to inspect the list.
2.4 Applied exercises
Exercise 2.6 A university department collects information on a small group of students enrolled in an introductory statistics course.
The following information is recorded for 8 students:
| Student | Department | Study Hours | Exam Grade |
|---|---|---|---|
| 1 | Economics | 12 | 65 |
| 2 | Economics | 15 | 70 |
| 3 | Maths | 18 | 78 |
| 4 | Physics | 10 | 60 |
| 5 | Maths | 20 | 85 |
| 6 | Economics | 8 | 55 |
| 7 | Physics | 14 | 72 |
| 8 | Maths | 16 | 80 |
- Create four separate vectors corresponding to:
- Student ID
- Department
- Study Hours
- Exam Grade
Convert the department variable into a factor and inspect its structure
Combine the four vectors into a single data frame called student. Give each column an appropriate name
Use str() and summary() to explore the data frame you have created
Create a new vector containing the study efficiency, defined as \[ \text{Efficiency} = \frac{\text{Exam Grade}}{\text{Study Hours}} \]
Add this new vector to the data frame as an additional column
Compute the mean exam score and mean study efficiency score across all students
Produce a scatter plot of Study Hours against Exam Grade and add:
- A title
- Axis Labels
Briefly comment on the relationship you observe between study hours and exam performance