Order rows in R with the arrange() function from dplyr
The arrange() function from dplyr reorders the rows of a data frame based on the values of one or more columns. It sorts in ascending order by default; to sort in descending order, wrap the column in the desc() function.
Sample data
In this tutorial we will use the following sample of the starwars dataset from dplyr, containing 10 rows and four columns.
library(dplyr)
df <- starwars[1:10, c(1, 2, 3, 11)]
df
# A tibble: 10 × 4
   name               height  mass species
   <chr>               <int> <dbl> <chr>
 1 Luke Skywalker        172    77 Human
 2 C-3PO                 167    75 Droid
 3 R2-D2                  96    32 Droid
 4 Darth Vader           202   136 Human
 5 Leia Organa           150    49 Human
 6 Owen Lars             178   120 Human
 7 Beru Whitesun lars    165    75 Human
 8 R5-D4                  97    32 Droid
 9 Biggs Darklighter     183    84 Human
10 Obi-Wan Kenobi        182    77 Human
Order by a single column
The arrange() function sorts rows by the specified column in ascending order by default. The following example sorts the rows by the height column in ascending order.
library(dplyr)
df <- starwars[1:10, c(1, 2, 3, 11)]
# Order by 'height' in ASCENDING order
df_2 <- df %>%
  arrange(height)
df_2
# A tibble: 10 × 4
   name               height  mass species
   <chr>               <int> <dbl> <chr>
 1 R2-D2                  96    32 Droid
 2 R5-D4                  97    32 Droid
 3 Leia Organa           150    49 Human
 4 Beru Whitesun lars    165    75 Human
 5 C-3PO                 167    75 Droid
 6 Luke Skywalker        172    77 Human
 7 Owen Lars             178   120 Human
 8 Obi-Wan Kenobi        182    77 Human
 9 Biggs Darklighter     183    84 Human
10 Darth Vader           202   136 Human
Arrange rows in descending order
To sort the rows in descending order, wrap the sorting variable in the desc() function, as illustrated below.
library(dplyr)
df <- starwars[1:10, c(1, 2, 3, 11)]
# Order by 'height' in DESCENDING order
df_2 <- df %>%
  arrange(desc(height))
df_2
# A tibble: 10 × 4
   name               height  mass species
   <chr>               <int> <dbl> <chr>
 1 Darth Vader           202   136 Human
 2 Biggs Darklighter     183    84 Human
 3 Obi-Wan Kenobi        182    77 Human
 4 Owen Lars             178   120 Human
 5 Luke Skywalker        172    77 Human
 6 C-3PO                 167    75 Droid
 7 Beru Whitesun lars    165    75 Human
 8 Leia Organa           150    49 Human
 9 R5-D4                  97    32 Droid
10 R2-D2                  96    32 Droid
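The desc() helper is not limited to numeric columns. As a minimal sketch on the same sample (the df_desc name is chosen here only for illustration), the following sorts the rows by name in reverse alphabetical order.

```r
library(dplyr)

df <- starwars[1:10, c(1, 2, 3, 11)]

# desc() also accepts character columns:
# order by 'name' in reverse alphabetical order
df_desc <- df %>%
  arrange(desc(name))

df_desc$name
```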
Sorting rows using custom functions
Rows can also be sorted by the result of a function applied to a column. The following example sorts the rows alphabetically by the first letter of the name column, extracted with substr().
library(dplyr)
df <- starwars[1:10, c(1, 2, 3, 11)]
# Order 'name' by the first letter of the name
df_2 <- df %>%
  arrange(substr(name, 1, 1))
df_2
# A tibble: 10 × 4
   name               height  mass species
   <chr>               <int> <dbl> <chr>
 1 Beru Whitesun lars    165    75 Human
 2 Biggs Darklighter     183    84 Human
 3 C-3PO                 167    75 Droid
 4 Darth Vader           202   136 Human
 5 Luke Skywalker        172    77 Human
 6 Leia Organa           150    49 Human
 7 Owen Lars             178   120 Human
 8 Obi-Wan Kenobi        182    77 Human
 9 R2-D2                  96    32 Droid
10 R5-D4                  97    32 Droid
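Because only the first letter is compared, names that share an initial are ties, and in the output above they appear in their original relative order (Luke Skywalker before Leia Organa, for instance). A computed key can be combined with other sorting keys to control how such ties are resolved. The sketch below, which uses the illustrative name df_3, breaks ties by descending height.

```r
library(dplyr)

df <- starwars[1:10, c(1, 2, 3, 11)]

# First key: the initial letter of 'name'
# Second key: 'height' in descending order, used only to break ties
df_3 <- df %>%
  arrange(substr(name, 1, 1), desc(height))

df_3$name
```

Within each initial the taller character now comes first, so Biggs Darklighter (183) precedes Beru Whitesun lars (165).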
Order rows by multiple columns
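Rows can also be sorted by multiple columns. In this case the columns are applied in order: rows are sorted by the first column, and each subsequent column is used only to break ties in the preceding ones. The example below sorts the rows by height and then by mass. Because every height in the sample is unique, mass never comes into play here and the result matches the single-column sort.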
library(dplyr)
df <- starwars[1:10, c(1, 2, 3, 11)]
# Order by 'height' and then by 'mass'
df_2 <- df %>%
  arrange(height, mass)
df_2
# A tibble: 10 × 4
   name               height  mass species
   <chr>               <int> <dbl> <chr>
 1 R2-D2                  96    32 Droid
 2 R5-D4                  97    32 Droid
 3 Leia Organa           150    49 Human
 4 Beru Whitesun lars    165    75 Human
 5 C-3PO                 167    75 Droid
 6 Luke Skywalker        172    77 Human
 7 Owen Lars             178   120 Human
 8 Obi-Wan Kenobi        182    77 Human
 9 Biggs Darklighter     183    84 Human
10 Darth Vader           202   136 Human
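Now reverse the order of the sorting columns. Rows are sorted by mass first, with height breaking ties, so Owen Lars (mass 120) falls below Obi-Wan Kenobi and Biggs Darklighter even though he is shorter than both.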
library(dplyr)
df <- starwars[1:10, c(1, 2, 3, 11)]
# Order by 'mass' and then by 'height'
df_2 <- df %>%
  arrange(mass, height)
df_2
# A tibble: 10 × 4
   name               height  mass species
   <chr>               <int> <dbl> <chr>
 1 R2-D2                  96    32 Droid
 2 R5-D4                  97    32 Droid
 3 Leia Organa           150    49 Human
 4 Beru Whitesun lars    165    75 Human
 5 C-3PO                 167    75 Droid
 6 Luke Skywalker        172    77 Human
 7 Obi-Wan Kenobi        182    77 Human
 8 Biggs Darklighter     183    84 Human
 9 Owen Lars             178   120 Human
10 Darth Vader           202   136 Human
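Ascending and descending keys can be mixed in the same call, since desc() applies only to the column it wraps. As a small sketch on the same sample (df_mixed is an illustrative name), the following sorts by mass from heaviest to lightest and breaks ties by ascending height.

```r
library(dplyr)

df <- starwars[1:10, c(1, 2, 3, 11)]

# 'mass' in DESCENDING order, then 'height' in ASCENDING order for ties
df_mixed <- df %>%
  arrange(desc(mass), height)

df_mixed
```

Among the two characters with mass 77, Luke Skywalker (172) is listed before Obi-Wan Kenobi (182) because the tie is resolved by ascending height.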
Arrange rows by group
Lastly, you can sort rows within groups. To do this, group the data by passing a categorical variable to group_by() and then set .by_group = TRUE inside arrange(). The grouping variable is then used as the first sorting key.
library(dplyr)
df <- starwars[1:10, c(1, 2, 3, 11)]
# Order by 'mass' within each 'species' group
df_2 <- df %>%
  group_by(species) %>%
  arrange(mass, .by_group = TRUE)
df_2
# A tibble: 10 × 4
# Groups:   species [2]
   name               height  mass species
   <chr>               <int> <dbl> <chr>
 1 R2-D2                  96    32 Droid
 2 R5-D4                  97    32 Droid
 3 C-3PO                 167    75 Droid
 4 Leia Organa           150    49 Human
 5 Beru Whitesun lars    165    75 Human
 6 Luke Skywalker        172    77 Human
 7 Obi-Wan Kenobi        182    77 Human
 8 Biggs Darklighter     183    84 Human
 9 Owen Lars             178   120 Human
10 Darth Vader           202   136 Human
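The .by_group argument matters because arrange() ignores grouping by default. The sketch below contrasts the two behaviours on the same sample; the names ignores_groups and within_groups are chosen here only for illustration.

```r
library(dplyr)

df <- starwars[1:10, c(1, 2, 3, 11)]

# Default (.by_group = FALSE): the grouping is ignored and
# rows are sorted by 'mass' across the whole data frame
ignores_groups <- df %>%
  group_by(species) %>%
  arrange(mass)

# .by_group = TRUE: rows are sorted by 'species' first, then by 'mass'
within_groups <- df %>%
  group_by(species) %>%
  arrange(mass, .by_group = TRUE)

# Compare how the species are laid out in each result
ignores_groups$species
within_groups$species
```

In the first result Droid and Human rows are interleaved, while in the second all Droid rows come before the Human rows. On ungrouped data, the same row order as the second result can be obtained with arrange(species, mass).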