Matrix in R
A matrix in R is a data structure for storing objects of the same type. If you want to store objects of different types in a single R data structure, use a list or a data frame instead. In this tutorial we are going to show you how to create matrices in R, how to label the columns and the rows with names and how to manipulate them.
How to create a matrix in R?
The matrix function creates a matrix data structure in the R programming language from a numeric, character or logical vector.
data <- 1:6
# Creating the matrix
matrix(data)
     [,1]
[1,]    1
[2,]    2
[3,]    3
[4,]    4
[5,]    5
[6,]    6
As you can observe in the output, by default the function creates a matrix with one column and as many rows as the length of the vector. However, you can set the number of columns or the number of rows with the ncol and nrow arguments, respectively. You can also specify whether the matrix is filled by rows or by columns with the byrow argument. By default, the function fills the matrix by columns.
# By columns
matrix(data, ncol = 2, byrow = FALSE) # byrow = FALSE by default
matrix(data, ncol = 2, nrow = 3) # Equivalent
matrix(data, nrow = 3) # Equivalent
# By rows
matrix(data, ncol = 2, byrow = TRUE)
# By columns       # By rows
     [,1] [,2]          [,1] [,2]
[1,]    1    4     [1,]    1    2
[2,]    2    5     [2,]    3    4
[3,]    3    6     [3,]    5    6
If you have data stored in vectors or in the columns of a data frame, you can use cbind for column binding or rbind for row binding, and the output will be a matrix. Note that the output class can be checked with the class function and the type of the elements with the typeof function.
x <- c(2, 7, 3, 6, 1)
y <- c(3, 7, 3, 5, 9)
# By columns
cbind(x, y)
# By rows
rbind(x, y)
# Output class
class(cbind(x, y)) # "matrix" "array" (only "matrix" before R 4.0.0)
# Data type of the elements
typeof(cbind(x, y)) # "double"
# By columns      # By rows
     x y            [,1] [,2] [,3] [,4] [,5]
[1,] 2 3          x    2    7    3    6    1
[2,] 7 7          y    3    7    3    5    9
[3,] 3 3
[4,] 6 5
[5,] 1 9
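Because class can return more than one value for a matrix (in R 4.0.0 and later it returns both "matrix" and "array"), comparing its result with == can be fragile. A minimal sketch of two more robust checks, using an illustrative object named m:

```r
x <- c(2, 7, 3, 6, 1)
y <- c(3, 7, 3, 5, 9)
m <- cbind(x, y)   # `m` is just an illustrative name

# Both checks work regardless of how many classes the object reports
is.matrix(m)          # TRUE
inherits(m, "matrix") # TRUE
```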
Note that you can use any data type inside a matrix, as long as all the elements share the same type.
matrix(c(TRUE, TRUE, FALSE, TRUE), ncol = 2)
matrix(c("red", "green", "orange", "black"), ncol = 2)
     [,1]  [,2]           [,1]    [,2]
[1,] TRUE FALSE      [1,] "red"   "orange"
[2,] TRUE  TRUE      [2,] "green" "black"
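To illustrate what homogeneous means in practice, the following short sketch (the name mixed is only illustrative) shows that mixing types does not raise an error. Instead, R silently coerces every element to the most flexible type involved, here character.

```r
# Mixing numbers, a string and a logical value in one vector
mixed <- matrix(c(1, "a", TRUE, 2), ncol = 2)

mixed
typeof(mixed) # "character": every element was coerced to a string
```

If a numeric matrix unexpectedly prints its values in quotes, a stray character element is a likely cause.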
You can also get the dimensions of a matrix in R with the dim function.
my_matrix <- matrix(1:12, ncol = 2, byrow = FALSE)
# Matrix dimensions
dim(my_matrix) # 6 2
The first number of the output of the dim function indicates the number of rows (6) and the second the number of columns (2).
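If you only need one of the two dimensions, the nrow and ncol functions listed in the table of this tutorial return them individually. A short example on the same matrix:

```r
my_matrix <- matrix(1:12, ncol = 2)

nrow(my_matrix)   # 6
ncol(my_matrix)   # 2
length(my_matrix) # 12, the total number of elements
```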
The dim function can also be used to create a matrix, by assigning dimensions to a vector.
A <- c(3, 1, 6, 1, 2, 9)
dim(A) <- c(3, 2)
A
     [,1] [,2]
[1,]    3    1
[2,]    1    2
[3,]    6    9
The following table shows the most common functions related to matrices in R. In the next sections we will review some of them.
Function | Description |
---|---|
dim(), nrow(), ncol() | Number of rows/columns |
diag() | Diagonal of a matrix |
* | Element-wise multiplication |
%*% | Matrix multiplication (dot product) |
%o% | Outer product |
%x% | Kronecker product |
cbind(), rbind() | Column/row bind |
t() | Transpose matrix |
solve(A) | Inverse of matrix \(A\) |
solve(A, b) | Solution to \(Ax=b\) |
eigen() | Eigenvalues and eigenvectors |
chol() | Cholesky decomposition |
qr() | QR decomposition |
svd() | Singular value decomposition |
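As a quick illustration of some of the operators in the table, here is a minimal sketch on a small invertible matrix. The names M, b, M_inv and sol are hypothetical and chosen only for this example.

```r
M <- matrix(c(2, 1, 1, 3), ncol = 2) # illustrative 2 x 2 matrix
b <- c(1, 2)                         # illustrative right-hand side

t(M)      # transpose
diag(M)   # diagonal elements: 2 3
M * M     # element-wise multiplication
M %*% M   # matrix multiplication

M_inv <- solve(M)   # inverse of M
M %*% M_inv         # identity matrix, up to rounding error

sol <- solve(M, b)  # solution of M x = b
M %*% sol           # recovers b, up to rounding error
```

Note that solve(M, b) is generally preferable to computing solve(M) %*% b when only the solution of the system is needed.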
Add and delete column to matrix in R
As we showed before, the cbind function can be used to create a matrix. However, the main use of the function is to append columns to data structures. To remove columns, use the - operator with the index of the column in the second position between brackets, as in the example.
# Add column
A <- cbind(A, c(6, 1, 7))
# Add two columns
A <- cbind(A, c(6, 1, 7), c(1, 6, 1))
# Remove first column
A <- A[, -1]
# Remove first and third column
A <- A[, -c(1, 3)]
Add and delete row to matrix in R
Like cbind for columns, the rbind function can be used to append rows to objects. You can delete rows the same way, but indicating the index in the first position between brackets.
# Add row
A <- rbind(A, c(6, 1))
# Add row of fives
A <- rbind(A, 5)
# Remove second row
A <- A[-2, ]
Stack matrices in R
The rbind function can also be used to stack or combine matrices:
x <- matrix(c(2, 7, 1, 3, 6, 1), ncol = 2, byrow = TRUE)
y <- matrix(c(3, 7, 6, 3, 5, 9), ncol = 2, byrow = TRUE)
# Stack matrices
rbind(x, y)
     [,1] [,2]
[1,]    2    7
[2,]    1    3
[3,]    6    1
[4,]    3    7
[5,]    6    3
[6,]    5    9
In addition, if you create a list of matrices in R and you don’t know the final length of the list, you can use the do.call function as follows to combine all of its matrices:
matrix_list <- list(x, y)
matrix_list
[[1]]
[,1] [,2]
[1,] 2 7
[2,] 1 3
[3,] 6 1
[[2]]
[,1] [,2]
[1,] 3 7
[2,] 6 3
[3,] 5 9
do.call(rbind, matrix_list)
rbind(matrix_list[[1]], matrix_list[[2]]) # Equivalent
The previous code will return the same output as rbind(x, y), but the difference is that with do.call we don’t need to know the number of matrices to be concatenated. If the list contains more matrices, the function will still work.
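The following sketch shows the same idea with a list whose length is only decided at run time. The names n_blocks, blocks, stacked and wide are illustrative, and the matrices are filled with constant values just to keep the output easy to read.

```r
# Build a list of matrices whose length is set by a variable
n_blocks <- 4
blocks <- lapply(seq_len(n_blocks), function(i) matrix(i, nrow = 2, ncol = 3))

# Stack every matrix in the list, however many there are
stacked <- do.call(rbind, blocks)
dim(stacked) # 8 3

# cbind can be used the same way for side-by-side binding
wide <- do.call(cbind, blocks)
dim(wide) # 2 12
```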
Add matrix row and column names
You can assign names to the rows and columns of a matrix with the rownames and colnames functions.
B <- matrix(c(4, 5, 1, 10, 8, 3),
nrow = 2, ncol = 3, byrow = TRUE)
# Set row names
rownames(B) <- c("Row 1", "Row 2")
rownames(B) <- paste0("Row ", 1:nrow(B)) # Equivalent
# Set column names
colnames(B) <- c("Column 1", "Column 2", "Column 3")
colnames(B) <- paste0("Column ", 1:ncol(B)) # Equivalent
B
      Column 1 Column 2 Column 3
Row 1        4        5        1
Row 2       10        8        3
Note that you could rename the matrix columns and rows the same way.
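For instance, a single name can be changed by indexing the result of colnames or rownames. A minimal sketch, where the new label "Middle" is an arbitrary choice for illustration:

```r
B <- matrix(c(4, 5, 1, 10, 8, 3), nrow = 2, ncol = 3, byrow = TRUE)
colnames(B) <- paste0("Column ", 1:3)

# Rename only the second column
colnames(B)[2] <- "Middle"
colnames(B) # "Column 1" "Middle" "Column 3"
```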
Moreover, with the attributes function you can access the dimensions and the column and row labels of your matrices.
attributes(B)
$dim
[1] 2 3
$dimnames
$dimnames[[1]]
[1] "Row 1" "Row 2"
$dimnames[[2]]
[1] "Column 1" "Column 2" "Column 3"
If you only want to return the column and row names, you can use the dimnames function instead and access the elements of the list to get the row names or the column names.
dimnames(B)
[[1]]
[1] "Row 1" "Row 2"
[[2]]
[1] "Column 1" "Column 2" "Column 3"
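As a short sketch of how the list returned by dimnames can be used, the example below sets both sets of names in one step and then extracts each element:

```r
B <- matrix(c(4, 5, 1, 10, 8, 3), nrow = 2, ncol = 3, byrow = TRUE)

# Set row and column names at once with a list (rows first, then columns)
dimnames(B) <- list(c("Row 1", "Row 2"),
                    c("Column 1", "Column 2", "Column 3"))

dimnames(B)[[1]] # row names, same as rownames(B)
dimnames(B)[[2]] # column names, same as colnames(B)
```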
Remove matrix row and column names
If you are working with a named matrix and you want to get rid of the names, you can delete the row or column names by setting them to NULL, or use the unname function to delete all names at once.
# Remove column names
colnames(B) <- NULL
# Remove row names
rownames(B) <- NULL
# Remove row and column names in one line
unname(B)
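Keep in mind that unname returns a modified copy and does not change the original object. The sketch below, on a small illustrative matrix, shows that the result has to be assigned back for the names to actually disappear.

```r
B <- matrix(1:4, ncol = 2,
            dimnames = list(c("Row 1", "Row 2"), c("Column 1", "Column 2")))

unname(B)      # prints a copy without names
dimnames(B)    # B itself still has its names

B <- unname(B) # assign the result to drop the names from B
dimnames(B)    # NULL
```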
Access matrix elements in R
Accessing matrix elements is similar to accessing data frames in R. The general form is my_matrix[rows, columns]. However, there are some differences.
my_matrix <- matrix(c(1, 5, 8, 1, 3, 2), ncol = 3)
my_matrix
     [,1] [,2] [,3]
[1,]    1    8    3
[2,]    5    1    2
# First element of the first column
my_matrix[1] # 1
my_matrix[[1]] # Equivalent
my_matrix[1, 1] # Equivalent
# Second row, third column
my_matrix[2, 3] # 2
# First row
my_matrix[1, ] # 1 8 3
# Second column
my_matrix[, 2] # 8 1
# First and second column, first row
my_matrix[1, 1:2] # 1 8
# First and third column, second row
my_matrix[2, c(1, 3)] # 5 2
my_matrix[2, c(TRUE, FALSE, TRUE)] # Equivalent
# All columns except the second
my_matrix[, -2]
# Last row of matrix
my_matrix[nrow(my_matrix), ] # 5 1 2
# Last column of matrix
my_matrix[, ncol(my_matrix)] # 3 2
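The single-index form my_matrix[1] works because a matrix is stored as a vector in column-major order, so one index counts down the first column, then the second, and so on. A brief sketch of this behaviour, together with selection by a logical condition:

```r
my_matrix <- matrix(c(1, 5, 8, 1, 3, 2), ncol = 3)

# A single index counts down the columns (column-major order)
my_matrix[3] # 8, same as my_matrix[1, 2]
my_matrix[6] # 2, same as my_matrix[2, 3]

# A logical condition returns the matching elements as a vector
my_matrix[my_matrix > 2] # 5 8 3

# which() with arr.ind = TRUE gives their row and column positions
which(my_matrix > 2, arr.ind = TRUE)
```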
Note that when returning single rows or columns the output is a vector. If you want to avoid this, set drop = FALSE.
my_matrix[1, , drop = FALSE]
     [,1] [,2] [,3]
[1,]    1    8    3
Note that we left the second argument blank because we are selecting all columns.
Moreover, if the matrix has names, you can access the elements indexing them with the character names.
C <- matrix(c(5, 3, 2, 52, 34, 12), nrow = 2, ncol = 3, byrow = TRUE)
rownames(C) <- c("Row 1", "Row 2")
colnames(C) <- c("Column 1", "Column 2", "Column 3")
# First row, columns 1 and 3
C["Row 1", c("Column 1", "Column 3")]
Column 1 Column 3
       5        2
Remove NA, NaN and Inf values from matrix
Sometimes you will need to deal with missing or non-finite values. There are different types, such as NA (Not Available), NaN (Not a Number) and Inf (Infinity) values. Note that you can delete the rows or columns containing these values or replace them with other values. Consider, for instance, the following matrix:
C <- matrix(c(14, NaN, 3, Inf, -5, 4, 1, NA), ncol = 4)
C
     [,1] [,2] [,3] [,4]
[1,]   14    3   -5    1
[2,]  NaN  Inf    4   NA
You can remove the rows or the columns with non-finite values with the rowSums or colSums and is.finite functions.
# Remove all rows with non-finite values
C[!rowSums(!is.finite(C)), ] # 14 3 -5 1
# Remove all columns with non-finite values
C[, !colSums(!is.finite(C)), drop = FALSE]
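The one-liners above are compact, so it may help to see the row case broken into steps. In this sketch the intermediate names non_finite, bad_per_row and keep_rows are illustrative only.

```r
C <- matrix(c(14, NaN, 3, Inf, -5, 4, 1, NA), ncol = 4)

non_finite <- !is.finite(C)        # TRUE where the value is NA, NaN or Inf
bad_per_row <- rowSums(non_finite) # 0 3: count of non-finite values per row
keep_rows <- bad_per_row == 0      # TRUE FALSE, same as !rowSums(non_finite)

# drop = FALSE keeps the result as a 1 x 4 matrix instead of a vector
C[keep_rows, , drop = FALSE]
```

The negation !rowSums(...) works because R treats 0 as FALSE and any other number as TRUE.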
In case you want to replace the values, you can detect the non-finite values using !is.finite and the NA values with the is.na function (which also returns TRUE for NaN) and then assign the values you want. In this case we are going to replace them with 0.
# Replace all NA and NaN with 0
C[is.na(C)] <- 0
C
     [,1] [,2] [,3] [,4]
[1,]   14    3   -5    1
[2,]    0  Inf    4    0
# Replace all non-finite values with 0
C[!is.finite(C)] <- 0
C
     [,1] [,2] [,3] [,4]
[1,]   14    3   -5    1
[2,]    0    0    4    0
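Since the detection functions overlap in what they catch, the following short comparison on an illustrative vector v may help when choosing which one to use:

```r
v <- c(1, NA, NaN, Inf, -Inf)

is.na(v)       # FALSE  TRUE  TRUE FALSE FALSE (NaN also counts as NA)
is.nan(v)      # FALSE FALSE  TRUE FALSE FALSE
is.infinite(v) # FALSE FALSE FALSE  TRUE  TRUE
is.finite(v)   #  TRUE FALSE FALSE FALSE FALSE
```

These functions work element-wise on matrices too, returning a logical matrix of the same dimensions.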