Sort data in R

Learn how to order data in R and use the sort and order functions

Sorting data in R can be achieved in several ways, depending on how you want to sort or order your data. In this tutorial you will learn how to sort in R in ascending, descending or alphabetical order, and how to order one vector based on another, across several data structures.

order() function in R

The R order function returns the permutation of indices that rearranges the elements of a vector into ascending order (by default). The syntax with summarized descriptions of the arguments is as follows:

order(..., # One or more vectors of the same length
      decreasing = FALSE, # Whether to sort in increasing or decreasing order
      na.last = TRUE,     # Whether to put NA values at the beginning or at the end
      method = c("auto", "shell", "radix")) # Method to be used. Defaults to auto


sort.list(x, # Atomic vector
          decreasing = FALSE,
          partial = NULL,    # Vector indices for partial sorting
          na.last = TRUE,
          method = c("auto", "shell", "quick", "radix"))

Note that the main difference between order and sort.list is that the former is designed to accept more than one vector of the same length, using the later vectors to break ties in the earlier ones. However, it is common to use the order function with just one vector.

v <- c(34, 47, 25, 14)
order(v)  

# sort.list(v) # Equivalent
4 3 1 2

The output is an index vector. In this example it means that, to sort the vector in ascending order, you have to put the fourth element first (14), then the third (25), then the first (34) and finally the second (47), which is the greatest value. If you set the decreasing argument to TRUE, you get the index vector that sorts the values in descending order.

order(v, decreasing = TRUE)  
2 1 3 4

If the vector contains any NA values, their indices will be placed at the end of the index vector by default.

vec <- c(24, 26, 2, 5, NA, 40, 12, NA)
order(vec)
3 4 7 1 2 6 5 8

If you would rather have the NA values appear at the beginning, set the na.last argument to FALSE. If you prefer to remove the NA values altogether, call na.omit on the vector first or use a similar approach, such as setting na.last = NA.

order(vec, na.last = FALSE)
5 8 3 4 7 1 2 6
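
As a brief sketch of the removal options just mentioned, the following reuses the vec vector from above. The names idx_no_na and clean are only illustrative and do not come from the original example.

```r
vec <- c(24, 26, 2, 5, NA, 40, 12, NA)

# na.last = NA removes the NA positions from the index vector
idx_no_na <- order(vec, na.last = NA)
idx_no_na
# 3 4 7 1 2 6

# na.omit drops the NA values first; the indices then refer to the cleaned vector
clean <- as.vector(na.omit(vec))
clean[order(clean)]
# 2 5 12 24 26 40
```

Note that the two approaches return indices relative to different vectors: the first refers to positions in vec, the second to positions in clean.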

You can also use the order function with a character vector. Note that ordering a categorical variable stored as character strings means ordering it alphabetically, following the collation rules of your locale.
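
A minimal sketch with a character vector follows. The fruits vector is a made-up example; it uses lowercase words only, so the result should not depend on the locale in common setups.

```r
fruits <- c("pear", "apple", "fig", "banana")

# Index vector that puts the strings in alphabetical order
order(fruits)
# 2 4 3 1

# Index the vector with it to obtain the sorted strings
fruits[order(fruits)]
# "apple" "banana" "fig" "pear"
```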

sort() function in R

The sort function returns the vector you pass as input with its values sorted, in ascending order by default.

sort(x,                  # Atomic vector
     decreasing = FALSE, # Whether to sort in increasing or decreasing order
     na.last = NA,       # NA (default) removes NA values; TRUE puts them last, FALSE first
     ...)                # Additional arguments

sort.int(x,                  # Atomic vector or factor
         partial = NULL,     # Partial sorting indices vector
         decreasing = FALSE, # Whether to sort in increasing or decreasing order
         na.last = NA,       # NA (default) removes NA values; TRUE puts them last, FALSE first
         method = c("auto", "shell", "quick", "radix"), # Method to be used. Defaults to auto
         index.return = FALSE) # Whether to return the ordering index vector or not
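
As a short illustration of these arguments, the sketch below reuses the v and vec vectors from the order section. Note in particular that sort drops missing values unless na.last is set to TRUE or FALSE.

```r
v <- c(34, 47, 25, 14)

sort(v)
# 14 25 34 47

sort(v, decreasing = TRUE)
# 47 34 25 14

# Missing values are removed unless na.last is TRUE or FALSE
vec <- c(24, 26, 2, 5, NA, 40, 12, NA)
sort(vec)
# 2 5 12 24 26 40

# na.last = TRUE keeps the NA values and places them at the end
sort(vec, na.last = TRUE)
# 2 5 12 24 26 40 NA NA
```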

Difference between sort and order in R

It is common to confuse the sort and order functions in R. Consider, for instance, the following vector and apply the order function to it:

my_vec <- c(1, 5.2, 22, 9, -5, 2)

ii <- order(my_vec)
ii
5 1 6 2 4 3

If you index the vector with the output of the order function you will obtain the initial vector sorted in ascending order:

my_vec[ii]
-5.0  1.0  2.0  5.2  9.0 22.0

On the other hand, the sort function returns the sorted values themselves, in ascending order by default. However, you can also obtain the index vector produced by the order function if you set the argument index.return to TRUE.

sort(my_vec, index.return = TRUE)
$x
[1] -5.0  1.0  2.0  5.2  9.0 22.0

$ix
[1] 5 1 6 2 4 3

Order vector in R

There are three different ways of ordering a vector: in ascending order, in descending order, or based on the index of another vector of the same length. In this section we are going to use the following sample vector:

x <- c(56, 14, 1, 28)

Note that when working with a large vector you can use the is.unsorted function to verify whether the vector is sorted in ascending order, instead of checking the order visually.

is.unsorted(x) # TRUE
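
A small sketch of how is.unsorted behaves on sorted input, using the same x vector. The check is for ascending order, so a vector sorted in descending order is still reported as unsorted.

```r
x <- c(56, 14, 1, 28)

is.unsorted(x)                           # TRUE: x is not in ascending order
is.unsorted(sort(x))                     # FALSE: 1 14 28 56 is ascending
is.unsorted(sort(x, decreasing = TRUE))  # TRUE: descending is not ascending
```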

Ascending order

Sorting in ascending order means that the values will be ordered from lowest to highest. For that purpose, you can use the order and sort functions as follows:

x[order(x)]

# Equivalent to:
ii <- order(x)
x[ii]

# Equivalent to:
sort(x)
1 14 28 56

Descending order

Sorting a vector in descending order means ordering the elements from highest to lowest. To do so, you can order the negated vector (with the minus sign, for numeric vectors) or set the argument decreasing = TRUE as follows:

x[order(-x)]

# Equivalent to:
x[order(x, decreasing = TRUE)]

# Equivalent to:
sort(x, decreasing = TRUE)
56 28 14 1

Order by other vector

You can reorder a vector using another vector of the same length as the index vector. In the following example the vector y indicates that the second element of the x vector (14) must come first, the third (1) second, the first (56) third and the fourth (28) last.

y <- c(2, 3, 1, 4)
x[y]
14 1 56 28

Moreover, you could also order the vector x by the values of the vector y, that is, index x with the output of order(y).

# order(y) # 3 1 2 4
x[order(y)]
 1 56 14 28
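
A common use of this pattern is reordering one vector by the values of a related one. The following sketch uses two hypothetical vectors, people and ages, which are not part of the original example.

```r
people <- c("Ana", "Luis", "Eva", "Tom")
ages   <- c(31, 25, 42, 19)

# Reorder the names by ascending age
people[order(ages)]
# "Tom" "Luis" "Ana" "Eva"

# From oldest to youngest
people[order(ages, decreasing = TRUE)]
# "Eva" "Ana" "Luis" "Tom"
```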

Order data frame or matrix in R

When working with a matrix or a data frame in R you may want to order the data by row or by column values. Although we are going to use a data frame as an example, the same approach applies to matrices. To explain how to sort a data frame in R we are going to use the attitude dataset from base R.

my_df <- attitude[, c(2, 3, 4)]
head(my_df)
  complaints privileges learning
1         51         30       39
2         64         51       54
3         70         68       69
4         63         45       47
5         78         56       66
6         55         49       44

Sort dataframe by column

Suppose you want to order the data frame by the privileges column in ascending order. To do so, you could type:

# Order by privileges column
ordered_df <- my_df[order(my_df$privileges), ]

# Show first rows
head(ordered_df)
   complaints privileges learning
1          51         30       39
21         40         33       34
30         82         39       59
7          67         42       56
24         37         42       58
25         54         42       48

Note that in case of ties, the rows keep their original relative order, so they appear sorted by row index.

In addition, if you need to sort your data frame by multiple columns, specify more columns inside the order function. This is very useful when the main column you are ordering by has ties.

# Order by 'privileges' column and then by 'complaints' column
ordered_df <- my_df[order(my_df$privileges, my_df$complaints), ]

# First rows
head(ordered_df)
   complaints privileges learning
1          51         30       39
21         40         33       34
30         82         39       59
24         37         42       58      # <- Note the difference
25         54         42       48      # with the previous output
7          67         42       56

Note that the complaints column is now sorted for those values where the privileges column has ties.
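
If the columns should be sorted in different directions, two options are sketched below: negating a numeric column, or passing one decreasing flag per column, which order only accepts with method = "radix". The names by_minus and by_radix are illustrative.

```r
my_df <- attitude[, c(2, 3, 4)]

# Ascending privileges, descending complaints within ties
# (the minus sign only works for numeric columns)
by_minus <- my_df[order(my_df$privileges, -my_df$complaints), ]

# Same idea with one decreasing flag per column; requires method = "radix"
by_radix <- my_df[order(my_df$privileges, my_df$complaints,
                        decreasing = c(FALSE, TRUE), method = "radix"), ]

# Within each group of tied privileges, complaints now run from highest to lowest
head(by_minus)
```

The second form also works for character columns, where the minus sign is not available.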

Change order of rows and columns

You can change the order of columns in R by modifying the order of the index that selects the columns. You can also reverse the order with a sequence from the number of columns of the data frame down to 1.

# Custom order of columns
my_df[, c(2, 1, 3)] 

# Reverse order of columns
my_df[, ncol(my_df):1]

Similarly, you can modify the order of rows:

# Custom order of rows (random)
my_df[sample(nrow(my_df), replace = FALSE), ]

# Reverse order of rows
my_df[nrow(my_df):1, ]
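
Since the same indexing patterns are said to apply to matrices, here is a minimal sketch with a small made-up matrix m, whose column names a and b are hypothetical.

```r
# matrix() fills by column: a = 3, 1, 2 and b = 9, 7, 8
m <- matrix(c(3, 1, 2,
              9, 7, 8), ncol = 2,
            dimnames = list(NULL, c("a", "b")))

# Order the rows by the values of column "a"
sorted_m <- m[order(m[, "a"]), ]
sorted_m
#      a b
# [1,] 1 7
# [2,] 2 8
# [3,] 3 9

# Reverse the order of the columns
reversed_m <- m[, ncol(m):1]
colnames(reversed_m)
# "b" "a"
```

When the result could have a single row or column, adding drop = FALSE inside the brackets keeps it as a matrix.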

Sort rows alphabetically

Consider the following sample data frame, where each row has been randomly named with a letter.

set.seed(4)
my_df <- data.frame(x = 1:10, y = 12:21)
rownames(my_df) <- sample(letters, nrow(my_df))
my_df
   x  y
p  1 12
a  2 13
h  3 14
g  4 15
r  5 16
f  6 17
o  7 18
v  8 19
s  9 20
b 10 21

You can order the rows alphabetically with the order and rownames functions as follows:

my_df[order(rownames(my_df)), ]
   x  y
a  2 13
b 10 21
f  6 17
g  4 15
h  3 14
o  7 18
p  1 12
r  5 16
s  9 20
v  8 19
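
The same idea works in reverse alphabetical order and for column names. The sketch below uses a small hypothetical data frame df with fixed row names, so the result does not depend on random sampling.

```r
df <- data.frame(x = 1:4, y = 5:8, row.names = c("d", "a", "c", "b"))

# Rows in reverse alphabetical order of their names
rows_desc <- df[order(rownames(df), decreasing = TRUE), ]
rows_desc
#   x y
# d 1 5
# c 3 7
# b 4 8
# a 2 6

# Columns in reverse alphabetical order of their names
cols_desc <- df[, order(colnames(df), decreasing = TRUE)]
colnames(cols_desc)
# "y" "x"
```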

Sort list in R

In this section you will learn how to sort a list in R. There are three ways of ordering a list in R: sorting the elements in alphabetical order by name, creating a custom order, or ordering a specific list element. Consider, for instance, the following sample list:

my_list <- list(b = 1:10, a = letters[1:5], c = matrix(1:2, ncol = 2))
my_list
$b
 [1]  1  2  3  4  5  6  7  8  9 10

$a
[1] "a" "b" "c" "d" "e"

$c
     [,1] [,2]
[1,]    1    2

You can order the elements of the list alphabetically using the order and names functions as follows:

# Order elements alphabetically
my_list[order(names(my_list))]
$a
[1] "a" "b" "c" "d" "e"

$b
 [1]  1  2  3  4  5  6  7  8  9 10

$c
     [,1] [,2]
[1,]    1    2

If preferred, you can manually create a custom order by specifying the names or the indices of the elements inside the c function.

# Custom sorting
my_list[c("b", "c", "a")]
my_list[c(1, 3, 2)] # Equivalent
$b
 [1]  1  2  3  4  5  6  7  8  9 10

$c
     [,1] [,2]
[1,]    1    2

$a
[1] "a" "b" "c" "d" "e"

Finally, you may want to order the contents of a single list element. In that case, the sorting works the same way as sorting a vector.

# Order list element
sort(my_list$b, decreasing = TRUE)
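
Two further variations are sketched below as possibilities rather than as part of the original approach: ordering the list by a property of its elements (here their length, via lengths), and sorting every element of a list of vectors with lapply. The list num_list is a made-up example.

```r
my_list <- list(b = 1:10, a = letters[1:5], c = matrix(1:2, ncol = 2))

# Order the list elements by their length, shortest first
by_length <- my_list[order(lengths(my_list))]
names(by_length)
# "c" "a" "b"

# Sort every element of a list of numeric vectors in descending order
num_list <- list(u = c(3, 1, 2), w = c(10, 30, 20))
sorted_list <- lapply(num_list, sort, decreasing = TRUE)
sorted_list$u
# 3 2 1
sorted_list$w
# 30 20 10
```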

How to sort categorical data in R?

You can order character or categorical data in R in different ways. Consider the following categorical variable:

set.seed(1)
categorical_data <- rownames(mtcars)[sample(10)]
categorical_data
"Datsun 710" "Hornet 4 Drive" "Hornet Sportabout" "Duster 360" "Mazda RX4 Wag"
"Merc 240D" "Merc 230" "Valiant" "Merc 280" "Mazda RX4" 

In this scenario you can use the sort function to sort the variable in alphabetical order, as reviewed in the section about ordering vectors. Digits inside the strings are compared character by character, like any other character, so "Merc 230" comes before "Merc 240D". Note that under this rule the string "10" would sort before "2".

sort(categorical_data)
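
To make the behaviour of digits inside strings concrete, the following sketch sorts a made-up vector of numbers stored as text, nums_chr, first as characters and then after converting it to numeric.

```r
nums_chr <- c("10", "2", "33", "1")

# Strings are compared character by character, so "10" comes before "2"
sort(nums_chr)
# "1" "10" "2" "33"

# Convert to numeric first if you need numeric order
sort(as.numeric(nums_chr))
# 1 2 10 33
```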

An alternative way to order a categorical variable alphabetically in R is to convert it to a factor and sort it.

sort(factor(categorical_data))
Datsun 710        Duster 360        Hornet 4 Drive    Hornet Sportabout
Mazda RX4         Mazda RX4 Wag     Merc 230          Merc 240D        
Merc 280          Valiant          
Levels: Datsun 710 Duster 360 Hornet 4 Drive Hornet Sportabout ... Valiant

However, if you want to return the index when ordering factors in R, you will need to call the sort.int function in order to use the index.return argument.

sort.int(factor(categorical_data), index.return = TRUE)
$x
 [1] Datsun 710        Duster 360        Hornet 4 Drive    Hornet Sportabout
 [5] Mazda RX4         Mazda RX4 Wag     Merc 230          Merc 240D        
 [9] Merc 280          Valiant          
10 Levels: Datsun 710 Duster 360 Hornet 4 Drive Hornet Sportabout ... Valiant

$ix
 [1]  1  4  2  3 10  5  7  6  9  8
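
Factors are sorted by the order of their levels, which is alphabetical only when the levels are left at their default. As a sketch, assuming a made-up variable sizes with a natural low to high ordering, you can set the levels explicitly to obtain a custom order:

```r
sizes <- c("medium", "low", "high", "low")

# Levels define the sort order of the factor
f <- factor(sizes, levels = c("low", "medium", "high"))

sort(f)
# [1] low    low    medium high
# Levels: low medium high

order(f)
# 2 4 1 3

# The plain character vector is sorted alphabetically instead
sort(sizes)
# "high" "low" "low" "medium"
```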