Chapter 5 Loops

5.1 For loops

‘For loops’, sometimes just called ‘loops’, are one of the most useful tools in programming, and once you understand how to implement them you will find that they become your best friends. That being said, people often like them so much that they use them when they are not necessary, as we will see later.

Simply put, a ‘for loop’ allows us to ‘loop’ through all the elements of a given object (usually a vector or matrix) and perform a command or operation for each element. When combined with ‘IF statements’, ‘for loops’ become very powerful and flexible and allow you to perform almost any task.

Let us start by understanding how a basic ‘for loop’ is constructed, then we will consider some simple examples. The general form of a for loop is as follows:

for (i in x) {
command in terms of i
}

That is, i takes the first value of the object x and the command inside the curly brackets is run with this value of i; then i moves on to the second value of x, and so on until every value of x has been used. For example:

for (i in c(1,2,3,4,5)) {
  print(i^2)
}
## [1] 1
## [1] 4
## [1] 9
## [1] 16
## [1] 25

This works perfectly but notice that we could also do this using what we called ‘vectorised calculation’, which takes advantage of how R deals with vectors on an element-by-element basis:

(1:5)^2
## [1]  1  4  9 16 25

As another example, consider the following:

(x <- seq(from = 10, to = 100, by = 5))
##  [1]  10  15  20  25  30  35  40  45  50  55  60  65  70  75  80  85  90  95 100
for (i in x){
  print(i %% 2 == 0)
}
## [1] TRUE
## [1] FALSE
## [1] TRUE
## [1] FALSE
## [1] TRUE
## [1] FALSE
## [1] TRUE
## [1] FALSE
## [1] TRUE
## [1] FALSE
## [1] TRUE
## [1] FALSE
## [1] TRUE
## [1] FALSE
## [1] TRUE
## [1] FALSE
## [1] TRUE
## [1] FALSE
## [1] TRUE

Again, was this necessary, or could we have used a vectorised calculation instead? Where possible, you should use the vectorised version of a command, as this saves time and processing power. That being said, there are many situations where ‘for loops’ are necessary, not just useful.
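The parity check from the loop above can also be sketched in vectorised form. This is only an illustration; the name is_even is introduced here and is not part of the original example.

```r
# Same vector as in the loop example above
x <- seq(from = 10, to = 100, by = 5)

# The comparison is applied to every element of x at once, so no loop is needed
is_even <- x %% 2 == 0
is_even
```

The result is a single logical vector with one TRUE or FALSE per element of x, rather than one printed line per iteration.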

Let us return to our mtcars data set seen in the previous chapter and consider a problem regarding plotting histograms of the data:

hist(mtcars$hp)

hist(mtcars[,4])

Now, imagine you wanted a histogram for every variable. Executing the code hist(mtcars) wouldn’t work as the input necessary for this function should be in the form of a single vector of values. However, to overcome this hurdle, we could make use of ‘for loops’:

for (i in 1:ncol(mtcars)){
  hist(mtcars[,i], main = paste("Histogram of", colnames(mtcars)[i]))
}

The above is great, but it would be nice to have them all on one screen together. Note that the code below is not really linked to for loops but is still worth mentioning here.

par(mfrow = c(3,4)) # Changes the plot frame to fit 3 rows and 4 columns of separate plots.
for(i in 1:ncol(mtcars)){
  hist(mtcars[,i])
}

This is much better but I would like the individual titles and axis labels to reflect the variable name:

par(mfrow = c(3,4))
for(i in 1:ncol(mtcars)){
  hist(mtcars[,i], main = paste("Histogram of", colnames(mtcars)[i]), xlab = paste(colnames(mtcars)[i]))
}

Even this very simple example starts to show you the value and versatility of for loops.
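One small caveat with the 1:ncol(...) pattern used above: if an object happens to have no columns, 1:0 counts downwards and the loop still runs. The base R function seq_len() avoids this. The sketch below assumes a hypothetical empty data frame called empty, used purely for illustration.

```r
# A hypothetical data frame with no columns, purely for illustration
empty <- data.frame()

# 1:ncol(empty) is 1:0, which counts DOWN and contains two values
bad_index <- 1:ncol(empty)
bad_index
## [1] 1 0

# seq_len(0) is an empty vector, so a loop over it would run zero times
good_index <- seq_len(ncol(empty))
length(good_index)
## [1] 0
```

For mtcars both forms give the same indices, so the difference only matters when the object might be empty.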

Now, as mentioned above, it is also possible to combine ‘for loops’ with IF statements. For example, the code below counts the number of even numbers in a vector of values:

x <- c(2,5,3,9,8,11,6)

count <- 0
for (i in x) {
  if(i %% 2 == 0){
    count <- count+1
  }
}
print(count)
## [1] 3

Exercise 5.1 Is there a quicker and easier way to achieve what has been done above without ‘for loops’?

Exercise 5.2 Can you write a ‘for loop’ that prints out the names of the cars in the mtcars data set which have 8 cylinders? Note, the car names can be found using the rownames(mtcars) command.

5.1.1 Matrices

So far, we have seen how we can loop through a vector of values to perform certain tasks, but it is also possible to do this over a matrix of values. The only difference is that this requires two loops (one for each index - row and column). For example:

(M <- matrix(round(runif(9,min = 0, max = 100)), nrow = 3, ncol = 3)) # This creates a 3x3 matrix of rounded uniform random values.
##      [,1] [,2] [,3]
## [1,]   44   86   46
## [2,]   20   98   35
## [3,]   56   34   26
for(i in 1:nrow(M)){
  for(j in 1:ncol(M)){
    print(paste("Element [", i,",",j,"] of M is equal to",M[i,j]))
  }
}
## [1] "Element [ 1 , 1 ] of M is equal to 44"
## [1] "Element [ 1 , 2 ] of M is equal to 86"
## [1] "Element [ 1 , 3 ] of M is equal to 46"
## [1] "Element [ 2 , 1 ] of M is equal to 20"
## [1] "Element [ 2 , 2 ] of M is equal to 98"
## [1] "Element [ 2 , 3 ] of M is equal to 35"
## [1] "Element [ 3 , 1 ] of M is equal to 56"
## [1] "Element [ 3 , 2 ] of M is equal to 34"
## [1] "Element [ 3 , 3 ] of M is equal to 26"

Another very important technique that you will need when working with ‘for loops’ is how to store values in a new vector (matrix) as you finish each loop. As a simple example let us see how we could use a ‘for loop’ to generate some random values and save them in a vector if they satisfy some condition.

Before we start, let us note how you can add a value to an already existing vector:

(x <- c(1, 3, 5, 7, 9))
## [1] 1 3 5 7 9
(x <- c(x, 11))
## [1]  1  3  5  7  9 11

In the above line of code, x has been over-written with a vector which contains all the values of the original vector x, followed by 11. This idea of over-writing an object using itself is a very common technique:

vec <- c() # Create an initially empty vector
for (i in 1:20){
  rand <- rnorm(1, mean = 0, sd = 1) # This generates a standard normal random variable
  if(rand > 0){
    vec <- c(vec,rand)
  }
}
vec
##  [1] 0.005512954 1.498507015 0.293155615 0.276509132 2.039636613 1.849125749
##  [7] 0.807890904 0.080278717 0.099427783 1.078409010 2.023306983 0.207384631
## [13] 0.515240159

Alternatively, you could save each value as a particular element in the vector:

vec <- c()
vec
## NULL
for (i in 1:20){
  rand <- rnorm(1, mean = 0, sd = 1)
  if(rand > 0){
    vec[i] <- rand
  }
}
vec
##  [1]        NA        NA        NA 0.1392454 1.2637265        NA 0.7050683
##  [8]        NA 0.9163221 0.2384281 1.0067351        NA        NA 0.4972835
## [15] 0.3291732 2.0054633        NA        NA        NA 0.3183235
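The NA entries appear because vec[i] is only assigned when rand > 0, so the skipped positions are left empty. Below is a sketch of two ways to end up with the positive values only. The call to set.seed(), the counter k and the vector with_gaps are additions made for this illustration.

```r
set.seed(1)  # Assumption: fixes the random draws so the run is reproducible

vec <- c()
k <- 0  # Hypothetical counter: number of positive values stored so far
for (i in 1:20) {
  rand <- rnorm(1, mean = 0, sd = 1)
  if (rand > 0) {
    k <- k + 1
    vec[k] <- rand  # Index by the counter rather than by i, so no gaps are left
  }
}
vec

# If the gaps are already there, is.na() can be used to drop them afterwards
with_gaps <- c(NA, 0.5, NA, 1.2)
with_gaps[!is.na(with_gaps)]
## [1] 0.5 1.2
```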

In fact, you could have easily set this up to store all the values in a matrix rather than a vector:

(mat <- matrix(c(rep(NA, 16)), nrow = 4)) # Empty matrix
##      [,1] [,2] [,3] [,4]
## [1,]   NA   NA   NA   NA
## [2,]   NA   NA   NA   NA
## [3,]   NA   NA   NA   NA
## [4,]   NA   NA   NA   NA
for (i in 1:4){
  for (j in 1:4){
    rand <- rnorm(1, mean = 0, sd = 1)
    if(rand > 0){
      mat[i,j] <- rand
    }
  }
}
mat
##           [,1]      [,2]     [,3]     [,4]
## [1,] 0.6653966        NA       NA 1.369854
## [2,] 0.5747815 0.2128382       NA       NA
## [3,]        NA        NA       NA       NA
## [4,]        NA 0.5684018 1.056248       NA
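In keeping with the earlier advice on vectorised calculations, a matrix like the one above can also be produced without any loops. The following is a sketch rather than an exact equivalent: the name draws and the call to set.seed() are assumptions, and matrix() fills column by column whereas the nested loops fill row by row, so the positions of the values will differ.

```r
set.seed(42)  # Assumption: fixes the random draws so the run is reproducible

# Draw all 16 standard normal values in one call and arrange them as a 4x4 matrix
draws <- matrix(rnorm(16, mean = 0, sd = 1), nrow = 4)

# Logical indexing: replace every entry that is not strictly positive with NA
draws[draws <= 0] <- NA
draws
```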

5.2 While loops

The final type of loop we will consider is the so-called ‘while loop’. A while loop is similar to a for loop but, instead of looping through the values of a specified vector (e.g. i in 1:10), it continues to loop whilst a certain condition holds and stops only when this condition is no longer satisfied. For example:

i <- 1
while (i < 6) {
  print(i)
  i <- i+1
}
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5

WARNING - Be very careful when using while loops. If you do not write them correctly they can result in your code running infinitely. As an example, try seeing what happens to the code above if you forget to increment i by one each time.
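One way to protect yourself while experimenting is to add a second condition that caps the number of iterations. The sketch below deliberately leaves out the increment of i; the names iterations and max_iterations, and the cap of 10, are arbitrary choices made for this illustration.

```r
i <- 1
iterations <- 0
max_iterations <- 10  # Hypothetical safety cap, chosen arbitrarily

# The increment of i has been left out on purpose, so i < 6 is always TRUE;
# it is the second condition that eventually stops the loop
while (i < 6 && iterations < max_iterations) {
  iterations <- iterations + 1
}
iterations
## [1] 10
```

If a loop does run away, it can usually be interrupted by pressing Esc in RStudio or Ctrl+C in a terminal session.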

While loops are very helpful when the number of loops required is unknown. For example, imagine we wanted to find the smallest integer for which the sum of all positive integers up to this value was greater than 1000. This can easily be done using a while loop:

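i <- 1
total <- 0 # Called total rather than sum, so the built-in sum() function is not masked

while(total <= 1000){ # Keep looping until the running total is strictly greater than 1000
  total <- total + i
  if (total <= 1000){
    i <- i + 1
  } else {
    print(i)
  }
}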
## [1] 45
sum(1:44)
## [1] 990
sum(1:45)
## [1] 1035
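As a cross-check on the while loop above, the same answer can be sketched with a vectorised calculation using the base R functions cumsum() and which(). The upper limit of 100 is an assumption; it only needs to be large enough for the running total to pass 1000. The names running_total and first_over are introduced for this illustration.

```r
# Running totals 1, 3, 6, 10, ... of the first 100 positive integers
running_total <- cumsum(1:100)

# Position of the first running total that is greater than 1000
first_over <- which(running_total > 1000)[1]
first_over
## [1] 45
```

The while loop is still the more natural tool when no sensible upper limit is known in advance.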

Exercise 5.3 Create a variable called speed and assign it a rounded, uniformly distributed random value between 50 and 60, i.e. round(runif(1, 50, 60)). Using a while loop, write code that prints “Your speed is ?? - Slow Down” whilst speed is greater than 30 and then reduces speed by a random amount rnorm(1, 5, 1). Once speed is less than or equal to 30, it should print “Your speed is ?? - Thank you for not speeding”.

I appreciate this is a lot to take in for those who are not familiar with programming, but I assure you these ideas become second nature with a little practice. For now, I highly recommend that you complete the exercises in DataCamp on conditional statements and loops (Intermediate R) for extra practice.

There are other versions and common commands used in loops, namely break, next and repeat, but I will leave these for you to explore yourself - you will need these for the exercises below.
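As a starting point for that exploration, here is a minimal sketch of how the three keywords behave. The names kept and value, and the numbers used, are chosen purely for illustration.

```r
# next: skip the rest of the current iteration and move on to the following one
kept <- c()
for (i in 1:5) {
  if (i == 3) {
    next
  }
  kept <- c(kept, i)
}
kept
## [1] 1 2 4 5

# repeat: loop with no condition of its own; break is what ends it
value <- 1
repeat {
  value <- value * 2
  if (value > 100) {
    break
  }
}
value
## [1] 128
```

Note that break can also be used inside for and while loops to leave them early.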

5.3 Exercises

Exercise 5.4 A for() loop is often used when we know exactly how many times we want something to repeat.

  1. Create a variable called total and assign it the value 0

  2. Write a for() loop that loops over the numbers 1 to 5. Inside the loop, you should:

  • Print the current value of total
  • Add the current value of the loop index to total
  3. After the loop finishes, print the final value of total. Think about your result and make sure you understand how it worked.

Exercise 5.5 Using for() loops, generate and print the first 20 values of the famous Fibonacci sequence (starting with \(0, 1\)). Recall, the Fibonacci sequence is obtained by evaluating the next number in the sequence as the sum of the previous two numbers in the sequence.


Exercise 5.6 Altering your code in the previous question, use a while() loop to determine how many values the Fibonacci sequence contains before its value exceeds 100,000.


Exercise 5.7 Use a while() loop to determine the smallest value of \(x\) such that \[ \prod_{n=1}^{x} n > 10^{6}. \]


Exercise 5.8 Using for() loops, create and fill a 5x5 matrix where each entry \(a_{ij}\) is given by \[ a_{ij} = i + j. \]

  1. Start by creating an empty 5x5 matrix

  2. Use nested for() loops to fill the matrix using the rule above

  3. Print the final matrix.


Exercise 5.9 An insurance company starts with an initial capital of 100 units.

Each year:

  • the company earns a fixed premium income of 12 units

  • it must pay claims of 15 units

  • So its capital decreases by 3 units per year.

The company is considered ruined once its capital is less than or equal to 0.

  1. Create a variable capital <- 100

  2. Create a counter variable year <- 0

  3. Use a repeat() loop to simulate the company’s capital year by year. Inside the loop:

  • Increase year by 1
  • Update the capital
  • Print the current year and capital
  4. Use break to stop the loop once ruin occurs and print out the number of years taken to become ruined.

  5. Could we have used a while() loop to do this instead? What is the main difference between a while() loop and a repeat() loop?


5.4 Applied exercises

Exercise 5.10 An insurance company starts with reserves of £100.

Each day:

  • the company earns £2 in premium income
  • it pays a claim equal to 5% of its current reserves

Thus, if we denote the reserve by \(R\), the update rule is \[ R_{new} = R + 2 - 0.05R. \] The company stops operating if:

  • reserves fall below £40 (danger zone), OR
  • reserves exceed £200 (strong surplus).
  1. Create variables reserve <- 100 and day <- 0. Use a while() or repeat() loop to simulate the reserve process day by day until it exits the safe region (£40 - £200). Print the reserve at each step.

  2. How many days does it take for the reserve to exit the safe region?

  3. Store the reserve values in a vector and plot the reserve path over time using plot()

  4. Repeat the simulation using different starting reserves:

\[ \{60, 80, 100, 120, 150\} \]

Use a for() loop to run each simulation for each starting value and record the number of days until exit. Which starting reserve survives the longest?

  5. Plot all of the reserves for the different starting values on the same plot.