Vector in R

Introduction to R Data Structures
Learn how to create vectors in R programming with the c function

What is a vector in the R programming language? Vectors are the most basic data structure in R. They store data of the same type in a fixed order. There are several ways to create a vector in R, such as joining two or more vectors, using sequences, or using random data generators.

What is a vector?

A vector is an ordered collection of objects of the same type. You can create logical, character, numeric, complex or even factor vectors, among others. The individual items of a vector are called components or elements. In addition, you can check the class of a vector with the class function and the underlying storage type of its elements with the typeof function.
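
As a small illustration of the difference between class and typeof, the following sketch builds three vectors (the names num, int and fac are arbitrary) and inspects them. The c function used here is described in the next section.

```r
# Illustrative vectors; the variable names are arbitrary
num <- c(1.5, 2, 3)
int <- 1:3
fac <- factor(c("low", "high", "low"))

class(num);  typeof(num)  # "numeric"  "double"
class(int);  typeof(int)  # "integer"  "integer"
class(fac);  typeof(fac)  # "factor"   "integer" (factors are stored as integer codes)
```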

Create vector in R

Vectors in R can be created with the c function, which concatenates objects. You can store a vector in memory by assigning it to a name with the <- operator.

# Creating R vectors with 'c' function
x <- c(12, 6, 67)
y <- c(2, 13)
y
2 13

Vectors can also be non-numeric. Hence, you can create vectors with characters, logical objects or other types of data objects.

state <- c("New York", "Ohio", "Washington", "Alabama")
class(state) # "character"

logic <- c(TRUE, TRUE, FALSE, TRUE)
class(logic) # "logical"

However, if you mix data types inside a vector, the components are coerced to the most flexible type present, following the order logical, integer, double, complex, character.

mix <- c(TRUE, "Correct", 8, 2.2)
mix # "TRUE" "Correct"  "8"  "2.2"

class(mix)  # "character"
typeof(mix) # "character"
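
The coercion order can be checked step by step. The sketch below is only illustrative: it mixes two types at a time and also shows explicit conversion with as.numeric and as.logical.

```r
# Implicit coercion: the result takes the more flexible of the two types
typeof(c(TRUE, 2L))   # "integer"
typeof(c(2L, 3.5))    # "double"
typeof(c(3.5, "a"))   # "character"

# Explicit coercion back from character
as.numeric(c("8", "2.2"))         # 8.0 2.2
as.logical(c("TRUE", "Correct"))  # TRUE NA (strings that are not logical values become NA)
```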

Name vector in R

You can also name the elements of a vector. To do so, supply a name for each component or only for some of them.

my_vector <- c(orange = 4, apple = 6)
my_vector
orange  apple
     4      6

In addition, if you have already created the vector, you can use the setNames function, which returns a named copy and leaves the original vector unchanged:

setNames(y, c("orange", "apple"))
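
A minimal sketch of the two naming approaches, assuming y holds the two values defined earlier. The variable named is introduced here only for illustration.

```r
y <- c(2, 13)

# setNames returns a named copy; y itself is not modified
named <- setNames(y, c("orange", "apple"))
names(y)  # NULL

# Assigning to names() modifies y directly
names(y) <- c("orange", "apple")

# Named elements can then be accessed by name
y["apple"]  # apple 13

# unname drops the names again
unname(y)   # 2 13
```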

Order vector in R

Sort function

To sort a vector, call the sort function and pass the vector as an argument. By default, the function sorts in ascending order.

z <- c(12, 15, 3, 22)
sort(z)
3 12 15 22

You can also sort in decreasing order by setting the decreasing argument to TRUE:

sort(z, decreasing = TRUE)
22 15 12 3

Order function

Alternatively, you can use the order function, which returns the indices that would sort the vector, and pass those indices inside brackets.

# Increasing order
z[order(z)]  # Equivalent to sort(z)

# Decreasing order
z[order(-z)] # Equivalent to sort(z, decreasing = TRUE)
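
Because order returns positions rather than values, it can also be used to rearrange a second vector according to the first one. In the sketch below, tags is a hypothetical vector of labels that runs parallel to z.

```r
z <- c(12, 15, 3, 22)
tags <- c("a", "b", "c", "d")  # hypothetical labels, one per element of z

# Positions of the elements of z from smallest to largest
order(z)                     # 3 1 2 4

# Reorder the labels by the values of z
tags[order(z)]               # "c" "a" "b" "d"

# The decreasing argument also works for non-numeric vectors
order(z, decreasing = TRUE)  # 4 2 1 3
```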

Reverse vector

You can reverse the order of a vector in R by calling the rev function.

# Reversing the order of a vector
rev(z)
22 3 15 12

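Note that rev only reverses the current order of the elements; it does not sort them. If you need the values in decreasing order, calling sort(z, decreasing = TRUE) is more direct than calling rev(sort(z)).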

Combine vectors

Combining two or more vectors is just as easy as creating one. You only need to call the c function and pass the vectors as arguments, which appends one vector to another.

x <- c(1, 2, 3)
y <- c(4, 5, 6)
c(x, y)
1 2 3 4 5 6

Note that the order of the arguments is relevant.

c(y, x)
4 5 6 1 2 3

Create empty vector

Sometimes you need to initialize an empty vector in R and fill it within a loop. To do so, you can call the c function without arguments, which returns NULL. You could also use the vector function.

# Empty vector
my_vector <- c()

# Filling the vector using a for loop
for(i in 1:10) {
  my_vector[i] <- i
}

my_vector
1  2  3  4  5  6  7  8  9  10

If you want to fill a vector in a loop, it is more efficient to pre-allocate memory by creating a vector of the final length, for example with NA values or with the vector function.

# Memory pre-allocation (two alternatives; use one of them)
my_vector <- rep(NA, 5)          # Five NA values
my_vector <- vector(length = 5)  # Five FALSE values (logical by default)

# Filling the vector using a for loop
for(i in 1:5) {
  my_vector[i] <- i
}

In this case the difference is not noticeable. For more expensive tasks, however, the reduction in execution time can be substantial.
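
As a sketch of the same idea with a typed pre-allocation, the numeric function creates a vector of zeros of a given length. The variable squares and the squaring task are only illustrative; for a computation this simple, the vectorised form at the end avoids the loop altogether.

```r
n <- 5

# Pre-allocate a numeric vector of length n (filled with zeros)
squares <- numeric(n)

# seq_along(squares) gives 1, 2, ..., length(squares)
for (i in seq_along(squares)) {
  squares[i] <- i^2
}

squares   # 1 4 9 16 25

# Vectorised alternative with no explicit loop
(1:n)^2   # 1 4 9 16 25
```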

Compare two vectors

There are several ways to compare vectors in R. Firstly, you can compare the elements one by one with a logical operator. Note that if one vector is longer than the other, the shorter one is recycled; if the longer length is not a multiple of the shorter length, R still returns a result but issues a warning.

x <- c(1, 5)
y <- c(4, 0)
x > y # FALSE TRUE

x <- c(1, 5)
y <- c(4, 0, 1, 3)

# This compares 1 > 4, 5 > 0, 1 > 1 and 5 > 3
x > y # FALSE TRUE FALSE TRUE

x <- c(1, 5, 1)
y <- c(4, 0, 1, 3)
x > y # FALSE TRUE FALSE FALSE, with a warning (lengths are not multiples)
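
In current versions of R, comparing vectors whose lengths are not multiples of each other produces a warning rather than stopping execution. The following sketch captures the warning message with tryCatch and then shows the value that is returned when the warning is suppressed; the variable msg is introduced only for illustration.

```r
x <- c(1, 5, 1)
y <- c(4, 0, 1, 3)

# Capture the warning message instead of printing it
msg <- tryCatch(
  x > y,
  warning = function(w) conditionMessage(w)
)
msg  # Text of the warning about the length mismatch

# x is recycled to 1, 5, 1, 1 before the comparison
suppressWarnings(x > y)  # FALSE TRUE FALSE FALSE
```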

Secondly, you can check which elements of the first vector are contained in the second with %in%.

x <- c(1, 5)
y <- c(4, 0, 1, 3)
x %in% y # TRUE FALSE

Thirdly, you can return the elements of the first vector that also appear in the second:

# Return common elements
x[x %in% y] # 1

Finally, you can check whether all the elements of the first vector are in the second with the all function as follows:

x <- c(1, 5)
y <- c(4, 5, 1, 3)

all(x %in% y) # TRUE
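
Base R also provides set-style helpers that cover the same comparisons more directly. The sketch below applies intersect, setdiff, setequal and identical to the vectors above; it is an alternative to the %in% approach, not a replacement for it.

```r
x <- c(1, 5)
y <- c(4, 5, 1, 3)

intersect(x, y)        # 1 5   (elements present in both)
setdiff(y, x)          # 4 3   (elements of y that are not in x)
setequal(x, y)         # FALSE (the two vectors do not hold the same set of values)
identical(x, c(1, 5))  # TRUE  (same type, same length, same values)
```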

Sequence vectors in R

In R, numeric sequences can be created in different ways. Among them, you can use the : operator or the seq and rep functions.

1:4
# 1 2 3 4

seq(1, 4, 0.5)
# 1.0 1.5 2.0 2.5 3.0 3.5 4.0

seq(from = 1, to = 5, length.out = 9)
# 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0

rep(1, 5)
# 1 1 1 1 1
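
A few further variations, given as an illustrative sketch: a decreasing sequence, the seq_len helper (which is safer than 1:n when n may be zero), and the times and each arguments of rep.

```r
# Decreasing sequence with a negative step
seq(10, 1, by = -3)          # 10 7 4 1

# seq_len(n) returns an empty vector when n is 0, whereas 1:0 returns 1 0
seq_len(4)                   # 1 2 3 4
length(seq_len(0))           # 0
1:0                          # 1 0

# Repeat the whole vector, or each element in turn
rep(c("a", "b"), times = 2)  # "a" "b" "a" "b"
rep(c("a", "b"), each = 2)   # "a" "a" "b" "b"
```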

Generate a random vector in R

In R, there are several functions to deal with random number generation. The function sample allows you to create random sequences. In the following code we simulate 5 throws (sample size: 5) of a die (6 possible results).

sample(1:6, size = 5, replace = TRUE)

The replace argument indicates whether the throw is with or without replacement. This means that if we set replace = FALSE and we get a 5 in the first throw, in the next throw we can only obtain 1, 2, 3, 4 or 6.
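
The effect of replace = FALSE can be sketched as follows. Drawing all six values without replacement yields a random permutation, and asking for more draws than there are values is an error. The variables perm and res are introduced only for illustration.

```r
# Six draws without replacement: every face appears exactly once
perm <- sample(1:6, size = 6, replace = FALSE)
sort(perm)  # 1 2 3 4 5 6

# Seven draws without replacement are impossible, so sample() fails
res <- tryCatch(
  sample(1:6, size = 7, replace = FALSE),
  error = function(e) "error"
)
res  # "error"
```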

You can also make use of the runif and rnorm functions, which generate random numbers from the Uniform and Normal distributions, respectively.

# Normal values
rnorm(5, mean = 0, sd = 1)
# -1.5611892 -0.2540665 -1.9912821 0.3040152 -1.4899171

# Uniform values
runif(5, min = 2, max = 10)
# 8.929246 8.610883 7.686587 7.495158 3.771902

Note on random number generation. When generating random numbers, you will obtain different values each time you execute the command, so your results will differ from the ones shown above. To make an example reproducible, first set the seed of the random number generator by calling the set.seed function.

set.seed(1) # You can set any other integer number as seed

# Normal values
rnorm(5, mean = 0, sd = 1)
# -0.6264538  0.1836433  -0.8356286  1.5952808  0.3295078

# Uniform values
runif(5, min = 2, max = 10)
# 3.647797 3.412454 7.496183 5.072830 8.158731
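
A quick way to confirm that a seed makes results reproducible is to reset it and draw again. In this sketch the seed value 123 and the names first and second are arbitrary.

```r
set.seed(123)       # Arbitrary seed
first <- rnorm(3)

set.seed(123)       # Resetting the same seed restarts the sequence
second <- rnorm(3)

identical(first, second)  # TRUE
```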

Length of vector

You can get the length of a given vector with the length function. The length of a vector is the count of its components.

my_data <- c("vector", "sequence", "rnorm", "runif")

n <- length(my_data)

# Length of the vector
n # 4

Access elements of vectors in R

Indexing allows you to access single elements such as the first or the last one, to subset the vector, and to replace, change or delete some of its elements. You can achieve this with numeric or logical indices.

Numeric index for accessing vector elements

In order to access the elements of a vector, indicate the corresponding index (a positive integer, starting at 1) inside brackets.

Negative indices select every position except the ones indicated.

Consider, for instance, the lowercase letters provided by the built-in letters constant.

lett <- letters
lett
"a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q"
"r" "s" "t" "u" "v" "w" "x" "y" "z"

The following code block gives some examples of accessing different elements.

# First element
lett[1]

# First element with double brackets (always returns exactly one element and drops names)
lett[[1]]

# Third and fourth element
lett[c(3, 4)]

# Last element of vector
lett[length(lett)]

# Letters at even positions
n <- length(lett) # 26
lett[seq(2, n, 2)]

# Letters at odd positions
lett[seq(1, n, 2)]
lett[-seq(2, n, 2)] # Equivalent
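
Some additional indexing idioms, shown as a sketch: head and tail for the first and last elements, a negative range to drop the leading positions, and the result of indexing beyond the end of the vector.

```r
lett <- letters

head(lett, 3)   # "a" "b" "c"
tail(lett, 1)   # "z"

# Drop the first 23 positions and keep the rest
lett[-(1:23)]   # "x" "y" "z"

# Indexing past the end does not fail; it returns NA
lett[30]        # NA
```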

Logical index for accessing vector elements

Another possibility is to use a logical vector. In this case, you will access the positions with a TRUE value. Let’s see an example with the maximum monthly temperature of a Spanish city in 2017.

temp <- c(22.52, 18.70, 19.61, 22.79, 29.38, 30.19,
          33.16, 36.97, 33.29, 28.98, 24.31, 22.43)

month <- c("January", "February", "March", "April", "May", "June",
           "July", "August", "September", "October", "November", "December")

As an example, now you can look for the months with values greater than 30.

# Months with maximum temperature greater than 30
month[temp > 30] 
"June"  "July"  "August"  "September"

Note that the output of temp > 30 is a logical vector.

FALSE FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE FALSE FALSE FALSE
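
A logical vector can also be turned into positions or counts. The sketch below repeats the temp and month definitions so that it runs on its own, then uses which, sum and which.max on them.

```r
temp <- c(22.52, 18.70, 19.61, 22.79, 29.38, 30.19,
          33.16, 36.97, 33.29, 28.98, 24.31, 22.43)

month <- c("January", "February", "March", "April", "May", "June",
           "July", "August", "September", "October", "November", "December")

# Positions where the condition is TRUE
which(temp > 30)        # 6 7 8 9

# TRUE counts as 1, so sum gives the number of matching months
sum(temp > 30)          # 4

# Month with the highest maximum temperature
month[which.max(temp)]  # "August"
```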

Also, you can combine the logical conditions.

# Months with maximum temperature lower than 20 OR greater than 35
month[temp < 20 | temp > 35] 
"February"  "March"  "August"

Add element to R vector

Now you can try to add the ‘ñ’ letter to the vector we created before. First, you need to find the previous (or the following) letter in the alphabet. We will look for the ‘n’ letter and put the ‘ñ’ just after. You can make use of the which function to find the index of the element in the vector that corresponds with the letter ‘n’.

# Looking for the index
n1 <- which(lett == "n")
n1 # 14

With this in mind, you can concatenate the characters in a single line of code.

c(lett[1:n1], "ñ", lett[-(1:n1)])
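
Base R also offers the append function, whose after argument gives the position after which the new values are inserted. The following sketch produces the same result as the concatenation above; the name with_enye is arbitrary.

```r
lett <- letters
n1 <- which(lett == "n")  # 14

# Insert the new letter after position n1
with_enye <- append(lett, "ñ", after = n1)

with_enye[14:16]   # "n" "ñ" "o"
length(with_enye)  # 27
```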

In case you want to add the element at the beginning or at the end of the vector just use the c function in the corresponding order.

# Adding the letter 'ñ' at the beginning of the vector
c("ñ", lett)

# Adding the letter 'ñ' at the end of the vector
c(lett, "ñ")

How to delete a vector in R?

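You can delete a vector in R with the rm function, or overwrite it by assigning another value, such as NULL. Note that only rm removes the name from the environment; after assigning NULL, the object still exists.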

my_vector <- c(1, 2, 5, 6, 7)

# With rm function
rm(my_vector)

# Overwriting the vector with another value
my_vector <- 0

# Assigning NULL
my_vector <- NULL
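
The difference between assigning NULL and calling rm can be checked with the exists function. This is a minimal sketch; after_null and after_rm are names introduced only to hold the results.

```r
my_vector <- c(1, 2, 5, 6, 7)

# Assigning NULL empties the object but keeps the name defined
my_vector <- NULL
after_null <- exists("my_vector")  # TRUE
is.null(my_vector)                 # TRUE

# rm removes the name itself
rm(my_vector)
after_rm <- exists("my_vector")    # FALSE
```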

Delete value from vector

If you want to delete only specific values of a vector, you can use the - sign with the indices you don’t want. Let’s see some examples.

cities <- c("London", "New York", "Paris") # Named cities to avoid masking the base vector function

# Deleting 'London'
cities[-1] # "New York" "Paris"
cities[which(cities != "London")]  # Equivalent
cities[-which(cities == "London")] # Equivalent, but only when there is a match
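
One caveat with the -which() form is worth a sketch: when nothing matches, which returns an empty index and the negative of an empty index selects nothing, so the whole vector is lost. Logical indexing does not have this problem. The cities vector is redefined here so the example runs on its own, and "Madrid" is just a value that is not present.

```r
cities <- c("London", "New York", "Paris")

# No element equals "Madrid", so which() returns integer(0)
cities[-which(cities == "Madrid")]         # character(0)

# Logical indexing keeps every element when there is no match
cities[cities != "Madrid"]                 # "London" "New York" "Paris"

# Removing several values at once
cities[!cities %in% c("London", "Paris")]  # "New York"
```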