# Matrix in R

## How to create a matrix in R?

The matrix function creates a matrix in R from a numeric, character or logical vector.

data <- 1:6

# Creating the matrix
matrix(data)
     [,1]
[1,]    1
[2,]    2
[3,]    3
[4,]    4
[5,]    5
[6,]    6

As the output shows, by default the function creates a matrix with one column and as many rows as the length of the vector. You can set the number of columns or rows with the ncol and nrow arguments, respectively, and choose whether the matrix is filled by rows or by columns with the byrow argument. By default, the matrix is filled by columns.

# By columns
matrix(data, ncol = 2, byrow = FALSE) # byrow = FALSE by default
matrix(data, ncol = 2, nrow = 3) # Equivalent
matrix(data, nrow = 3) # Equivalent

# By rows
matrix(data, ncol = 2, byrow = TRUE)
# By columns        # By rows
     [,1] [,2]           [,1] [,2]
[1,]    1    4      [1,]    1    2
[2,]    2    5      [2,]    3    4
[3,]    3    6      [3,]    5    6
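
As a quick check of how byrow behaves, the following sketch fills the same vector both ways. The names by_row and by_col are illustrative only. It suggests that filling by rows is the same as filling the transposed shape by columns and then transposing the result.

```r
data <- 1:6

# Fill a 3 x 2 matrix row by row
by_row <- matrix(data, nrow = 3, byrow = TRUE)

# Fill a 2 x 3 matrix column by column, then transpose it
by_col <- t(matrix(data, nrow = 2))

# Both approaches give the same 3 x 2 matrix
all(by_row == by_col) # TRUE
```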

If your data is stored in vectors or in the columns of a data frame, you can use cbind for column binding or rbind for row binding, and the output will be a matrix. You can check the class of the output with the class function and the type of its elements with the typeof function.

x <- c(2, 7, 3, 6, 1)
y <- c(3, 7, 3, 5, 9)

# By columns
cbind(x, y)

# By rows
rbind(x, y)

# Output class
class(cbind(x, y))  # "matrix" "array" (only "matrix" before R 4.0.0)

# Data type of the elements
typeof(cbind(x, y)) # "double"
# By columns        # By rows
     x y              [,1] [,2] [,3] [,4] [,5]
[1,] 2 3            x    2    7    3    6    1
[2,] 7 7            y    3    7    3    5    9
[3,] 3 3
[4,] 6 5
[5,] 1 9
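
The text mentions data frame columns as a possible input. A minimal sketch of that case could look as follows, where df and m are hypothetical names introduced only for this illustration.

```r
# Hypothetical data frame used only for illustration
df <- data.frame(x = c(2, 7, 3, 6, 1),
                 y = c(3, 7, 3, 5, 9))

# Bind two columns of the data frame into a matrix,
# naming the arguments to keep the column names
m <- cbind(x = df$x, y = df$y)

is.matrix(m) # TRUE
typeof(m)    # "double"
dim(m)       # 5 2
```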

Note that a matrix can hold any data type, as long as all its elements are of the same type.

matrix(c(TRUE, TRUE, FALSE, TRUE), ncol = 2)
matrix(c("red", "green", "orange", "black"), ncol = 2)
     [,1]  [,2]             [,1]    [,2]
[1,] TRUE FALSE        [1,] "red"   "orange"
[2,] TRUE  TRUE        [2,] "green" "black"
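
To illustrate what homogeneous means in practice, the sketch below mixes types on purpose. The variable names mixed and num are illustrative. When the input vector combines different types, R converts all elements to a common type before building the matrix.

```r
# Mixing numbers and strings: everything becomes character
mixed <- matrix(c(1, "a", 2, "b"), ncol = 2)
typeof(mixed) # "character"

# Mixing logical and numeric values: everything becomes double
num <- matrix(c(TRUE, 2, FALSE, 4), ncol = 2)
typeof(num)   # "double"
num[1, 1]     # 1 (TRUE was converted to 1)
```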

You can also get the dimensions of a matrix with the dim function.

my_matrix <- matrix(1:12, ncol = 2, byrow = FALSE)

# Matrix dimensions
dim(my_matrix) # 6 2

The first number of the output of the dim function indicates the number of rows (6) and the second the number of columns (2).

Assigning to dim also turns a vector into a matrix, which is filled by columns.

A <- c(3, 1, 6, 1, 2, 9)
dim(A) <- c(3, 2)
A
     [,1] [,2]
[1,]    3    1
[2,]    1    2
[3,]    6    9
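
One detail worth keeping in mind when setting dim is that the product of the dimensions has to match the length of the vector. The following sketch, with illustrative names v, w and result, shows both a valid assignment and a failing one captured with tryCatch.

```r
v <- 1:6

# 3 * 2 = 6 elements, so this works
dim(v) <- c(3, 2)
nrow(v) # 3
ncol(v) # 2

# 4 * 2 = 8 does not match the 6 elements, so R raises an error
w <- 1:6
result <- tryCatch({
  dim(w) <- c(4, 2)
  "ok"
}, error = function(e) "error")
result # "error"
```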

The following table shows the most common functions related to matrices in R. In the next sections we will review some of them.

### Add and delete column to matrix in R

As shown before, the cbind function can be used to create a matrix, but its main use is to append columns to data structures. To remove columns, use negative indices in the second position between the brackets, as in the example.

# Add column
A <- cbind(A, c(6, 1, 7))

# Add two columns at once
A <- cbind(A, c(6, 1, 7), c(1, 6, 1))

# Remove first column
A <- A[, -1]

# Remove first and third column
A <- A[, -c(1, 3)]
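
cbind always appends its arguments in the order given, so a column can also be placed in the middle of a matrix by splitting it first. This is only a sketch of one possible approach. The names M, new_col and M2 are hypothetical.

```r
M <- matrix(1:6, nrow = 3)
new_col <- c(10, 20, 30)

# Insert new_col between the first and second columns.
# drop = FALSE keeps the single-column slices as matrices.
M2 <- cbind(M[, 1, drop = FALSE], new_col, M[, 2, drop = FALSE])

dim(M2) # 3 3
M2[, 2] # 10 20 30
```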

### Add and delete row to matrix in R

Similarly, the rbind function can be used to append rows to objects. You can delete rows the same way as columns, but placing the negative index in the first position between the brackets.

# Add row
A <- rbind(A, c(6, 1))

# Add a row filled with a single value (5 is recycled)
A <- rbind(A, 5)

# Remove second row
A <- A[-2, ]
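
The example above relies on recycling: a value shorter than the number of columns is repeated to fill the new row. A small sketch with a hypothetical matrix M makes this visible.

```r
M <- matrix(1:4, ncol = 2)

# A single value is recycled to fill the whole new row
M <- rbind(M, 0)
M[3, ] # 0 0

# A vector with one value per column is appended as is
M <- rbind(M, c(9, 8))
dim(M) # 4 2
M[4, ] # 9 8
```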

### Stack matrices in R

The rbind function can also be used to stack or combine matrices:

x <- matrix(c(2, 7, 1, 3, 6, 1), ncol = 2, byrow = TRUE)
y <- matrix(c(3, 7, 6, 3, 5, 9), ncol = 2, byrow = TRUE)

# Stack matrices
rbind(x, y)
     [,1] [,2]
[1,]    2    7
[2,]    1    3
[3,]    6    1
[4,]    3    7
[5,]    6    3
[6,]    5    9

In addition, if you create a list of matrices in R and you don’t know the final length of the list, you can use the do.call function as follows to merge all of its matrices:

matrix_list <- list(x, y)
matrix_list
[[1]]
     [,1] [,2]
[1,]    2    7
[2,]    1    3
[3,]    6    1

[[2]]
     [,1] [,2]
[1,]    3    7
[2,]    6    3
[3,]    5    9

do.call(rbind, matrix_list)
rbind(matrix_list[[1]], matrix_list[[2]]) # Equivalent

The previous code will return the same output as rbind(x, y) but the difference is that with do.call we don’t need to know the number of matrices to be concatenated. If the list contains more matrices inside it, the function will still work.
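
To see this with a list of a different length, consider the sketch below. The names n, mats and stacked are hypothetical, and the matrices are generated only for illustration.

```r
# Build a list with an arbitrary number of 2 x 2 matrices
n <- 4 # hypothetical number of matrices
mats <- lapply(seq_len(n), function(i) matrix(i, nrow = 2, ncol = 2))

# Stack all of them without naming each element
stacked <- do.call(rbind, mats)
dim(stacked) # 8 2
```

Changing n changes the number of stacked matrices, but the do.call line stays the same.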

### Add matrix row and column names

You can assign names to the rows and columns of a matrix with the rownames and colnames functions.

B <- matrix(c(4, 5, 1, 10, 8, 3),
            nrow = 2, ncol = 3, byrow = TRUE)

# Set row names
rownames(B) <- c("Row 1", "Row 2")
rownames(B) <- paste0("Row ", 1:nrow(B)) # Equivalent

# Set column names
colnames(B) <- c("Column 1", "Column 2", "Column 3")
colnames(B) <- paste0("Column ", 1:ncol(B)) # Equivalent
B
      Column 1 Column 2 Column 3
Row 1        4        5        1
Row 2       10        8        3

Note that you could rename the matrix columns and rows the same way.
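
As a sketch of renaming, the example below uses a separate matrix D (a hypothetical name) so that B stays untouched. It also uses the dimnames argument of matrix to set the names at creation time.

```r
D <- matrix(c(4, 5, 1, 10, 8, 3), nrow = 2, ncol = 3, byrow = TRUE,
            dimnames = list(c("Row 1", "Row 2"),
                            c("Column 1", "Column 2", "Column 3")))

# Rename only the second column
colnames(D)[2] <- "Middle"

# Rename all rows at once
rownames(D) <- c("First", "Second")

colnames(D) # "Column 1" "Middle" "Column 3"
rownames(D) # "First" "Second"
```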

Moreover, with the attributes function you can access the dimension and the column and row labels of your matrices.

attributes(B)
$dim
[1] 2 3

$dimnames
$dimnames[[1]]
[1] "Row 1" "Row 2"

$dimnames[[2]]
[1] "Column 1" "Column 2" "Column 3"


If you only want to return your column and row names you can use the dimnames function instead and access the elements of the list to get the row names or the column names.

dimnames(B)
[[1]]
[1] "Row 1" "Row 2"

[[2]]
[1] "Column 1" "Column 2" "Column 3"
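
Accessing the elements of that list could look as follows. The matrix named_m is a hypothetical stand-in for any named matrix.

```r
named_m <- matrix(1:6, nrow = 2,
                  dimnames = list(c("Row 1", "Row 2"),
                                  c("Column 1", "Column 2", "Column 3")))

row_names <- dimnames(named_m)[[1]] # first element: row names
col_names <- dimnames(named_m)[[2]] # second element: column names

# Same values as returned by rownames and colnames
identical(row_names, rownames(named_m)) # TRUE
identical(col_names, colnames(named_m)) # TRUE
```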

### Remove matrix row and column names

If you are working with a named matrix and want to get rid of the names, you can delete the row or column names by setting them to NULL, or use the unname function to delete all names.

# Remove column names
colnames(B) <- NULL

# Remove row names
rownames(B) <- NULL

# Remove row and column names
# with one line
unname(B)
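
Unlike the NULL assignments, unname does not modify its argument: it returns a copy without names. A short sketch with a hypothetical matrix E shows the difference.

```r
E <- matrix(1:4, ncol = 2,
            dimnames = list(c("a", "b"), c("c1", "c2")))

# unname returns a copy without names; E itself is unchanged
is.null(dimnames(unname(E))) # TRUE
is.null(dimnames(E))         # FALSE

# Reassign to actually drop the names
E <- unname(E)
is.null(dimnames(E))         # TRUE
```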

## Access matrix elements in R

Accessing matrix elements is similar to accessing data frames in R. The general form is my_matrix[rows, columns]. However, there are some differences.

my_matrix <- matrix(c(1, 5, 8, 1, 3, 2), ncol = 3)
my_matrix
     [,1] [,2] [,3]
[1,]    1    8    3
[2,]    5    1    2

# First element of the first column
my_matrix[1]    # 1
my_matrix[[1]]  # Equivalent
my_matrix[1, 1] # Equivalent

# Second row, third column
my_matrix[2, 3] # 2

# First row
my_matrix[1, ] # 1 8 3

# Second column
my_matrix[, 2] # 8 1

# First and second column, first row
my_matrix[1, 1:2] # 1 8

# First and third column, second row
my_matrix[2, c(1, 3)] # 5 2
my_matrix[2, c(TRUE, FALSE, TRUE)] # Equivalent

# All columns except the second
my_matrix[, -2]

# Last row of matrix
my_matrix[nrow(my_matrix), ] # 5 1 2

# Last column of matrix
my_matrix[, ncol(my_matrix)] # 3 2
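
One of the differences with data frames is how a single index is interpreted. The sketch below recreates the same my_matrix. The data frame df is a hypothetical conversion added only for comparison.

```r
my_matrix <- matrix(c(1, 5, 8, 1, 3, 2), ncol = 3)

# A single index walks the matrix column by column
my_matrix[3] # 8, same as my_matrix[1, 2]

# On a data frame, a single index selects a column instead
df <- as.data.frame(my_matrix)
class(df[1]) # "data.frame"

# Row and column positions of the elements greater than 2
positions <- which(my_matrix > 2, arr.ind = TRUE)
nrow(positions) # 3
```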

Note that when returning single rows or columns the output is a vector. If you want to avoid this, set drop = FALSE.

my_matrix[1, , drop = FALSE]
     [,1] [,2] [,3]
[1,]    1    8    3

Note that we left the second argument blank because we are selecting all columns.
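
The effect of drop can be checked with is.matrix and dim. In this sketch row_vec and row_mat are illustrative names.

```r
my_matrix <- matrix(c(1, 5, 8, 1, 3, 2), ncol = 3)

row_vec <- my_matrix[1, ]               # plain vector
row_mat <- my_matrix[1, , drop = FALSE] # 1 x 3 matrix

is.matrix(row_vec) # FALSE
dim(row_vec)       # NULL
dim(row_mat)       # 1 3
```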

Moreover, if the matrix has names, you can access the elements indexing them with the character names.

C <- matrix(c(5, 3, 2, 52, 34, 12), nrow = 2, ncol = 3, byrow = TRUE)

rownames(C) <- c("Row 1", "Row 2")
colnames(C) <- c("Column 1", "Column 2", "Column 3")

# First row, columns 1 and 3
C["Row 1", c("Column 1", "Column 3")]
Column 1 Column 3
       5        2

## Remove NA, NaN and Inf values from matrix

Sometimes you will need to deal with missing or non-finite values. R has several special values, such as NA (Not Available), NaN (Not a Number) and Inf (Infinity). You can either delete the rows or columns containing these values or replace them with other values. Consider, for instance, the following matrix:

C <- matrix(c(14, NaN, 3, Inf, -5, 4, 1, NA), ncol = 4)
C
     [,1] [,2] [,3] [,4]
[1,]   14    3   -5    1
[2,]  NaN  Inf    4   NA

You can remove the rows or the columns with non-finite values with the rowSums or colSums and is.finite functions.

# Remove all rows with non-finite values
C[!rowSums(!is.finite(C)), ] # 14  3 -5  1

# Remove all columns with non-finite values
C[, !colSums(!is.finite(C)), drop = FALSE]
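
The row filter above is compact, so it may help to break it into steps. The intermediate names bad, bad_per_row and keep are hypothetical and only serve this walkthrough.

```r
C <- matrix(c(14, NaN, 3, Inf, -5, 4, 1, NA), ncol = 4)

# TRUE where the value is NA, NaN, Inf or -Inf
bad <- !is.finite(C)

# Number of non-finite values in each row
bad_per_row <- rowSums(bad) # 0 3

# Negating a count gives TRUE only when the count is 0
keep <- !bad_per_row        # TRUE FALSE

# An equivalent, more explicit condition that keeps the matrix shape
clean <- C[bad_per_row == 0, , drop = FALSE]
dim(clean) # 1 4
```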

To replace the values instead, detect the non-finite values with !is.finite or the missing values with is.na (which is also TRUE for NaN) and assign the value you want. In this case we are going to replace them with 0.

# Replace all NA and NaN with 0
C[is.na(C)] <- 0
C
     [,1] [,2] [,3] [,4]
[1,]   14    3   -5    1
[2,]    0  Inf    4    0

# Replace all non-finite values with 0
C[!is.finite(C)] <- 0
C
     [,1] [,2] [,3] [,4]
[1,]   14    3   -5    1
[2,]    0    0    4    0
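
The detection functions overlap in ways that are easy to miss. The sketch below compares them on a small hypothetical vector x and shows one way to replace NA while keeping NaN.

```r
x <- c(NA, NaN, Inf, 1)

is.na(x)       # TRUE  TRUE  FALSE FALSE
is.nan(x)      # FALSE TRUE  FALSE FALSE
is.infinite(x) # FALSE FALSE TRUE  FALSE
is.finite(x)   # FALSE FALSE FALSE TRUE

# Replace NA but keep NaN
x[is.na(x) & !is.nan(x)] <- 0
x # 0 NaN Inf 1
```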