Lowercase and uppercase in R with tolower() and toupper()

Lowercase and uppercase in R with tolower(), toupper() and chartr()

R provides a set of functions for case folding, that is, for transforming strings to uppercase or to lowercase: toupper, tolower and casefold. In addition, we will review the chartr function, which performs character-by-character translations.

To uppercase with toupper

toupper is the base R function to transform any string or character vector to uppercase. The following example illustrates how to transform a string to uppercase:

x <- "this is a sample text"

# Transform to upper case with toupper()
x <- toupper(x)
x
"THIS IS A SAMPLE TEXT"

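Because toupper is vectorized, the same call works on a whole character vector. The following is a minimal sketch (the vector fruits is a made-up example) showing that digits and punctuation are left untouched:

```r
# Hypothetical sample vector with mixed case, digits and punctuation
fruits <- c("apple", "Banana", "cherry-123")

# toupper() is applied element by element
upper_fruits <- toupper(fruits)
upper_fruits
# "APPLE" "BANANA" "CHERRY-123"
```
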
Column and column names to uppercase

It is possible to convert character columns, column names, row names of a data frame or other objects to uppercase with toupper. For instance, to transform a character column from lowercase to uppercase you just need to assign the new value to the original column, as shown below.

# Sample data frame
df <- data.frame(x = 1:5, y = c("a", "b", "c", "d", "e"))

# Transform column to upper case
df$y <- toupper(df$y)
df
  x y
1 1 A
2 2 B
3 3 C
4 4 D
5 5 E

For column or row names, apply the toupper function to the column or row names of the data frame.

# Sample data frame
df <- data.frame(x = 1:5, y = c("a", "b", "c", "d", "e"))

# Transform column names to upper case
colnames(df) <- toupper(colnames(df))
df
  X Y
1 1 a
2 2 b
3 3 c
4 4 d
5 5 e
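
When a data frame has several character columns, it can be convenient to convert all of them at once. The following is a sketch under assumptions: the data frame, its column names and its row names are invented for illustration, and only columns of type character are touched.

```r
# Hypothetical data frame with two character columns and named rows
df <- data.frame(
  id = 1:3,
  name = c("ann", "bob", "cy"),
  city = c("oslo", "rome", "lima"),
  row.names = c("r1", "r2", "r3"),
  stringsAsFactors = FALSE
)

# Find the character columns and convert only those to upper case
is_chr <- vapply(df, is.character, logical(1))
df[is_chr] <- lapply(df[is_chr], toupper)

# Row names are handled the same way as column names
rownames(df) <- toupper(rownames(df))
df
```

The numeric column id is left as is, because toupper would otherwise coerce it to character.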

First letter to uppercase

When working with names or similar data, you will often want the first letter capitalized. To achieve this, extract the first letter and the rest of the string, convert the first letter to uppercase and paste the two parts back together:

# Sample vector
x <- c("john", "Kate", "Stan")

# Transform first letter to upper case
x <- paste0(toupper(substr(x, 1, 1)), substring(x, 2))
x
"John" "Kate" "Stan"

In this example, substr(x, 1, 1) extracts the first letter of each name, toupper converts it to uppercase, and then paste0 is used to combine the uppercase first letter with the rest of the name extracted with substring(x, 2).
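
If this transformation is needed in several places, it may be worth wrapping it in a small function. The helper name capitalize_first below is hypothetical, and the second call is an alternative sketch based on a Perl-style regular expression, where \\U upper-cases the captured first character:

```r
# Hypothetical helper: capitalize the first character, leave the rest as is
capitalize_first <- function(x) {
  paste0(toupper(substr(x, 1, 1)), substring(x, 2))
}

first_a <- capitalize_first(c("john", "mcDonald", ""))
first_a
# "John" "McDonald" ""

# Same idea with a regular expression (requires perl = TRUE for \\U)
first_b <- sub("^(.)", "\\U\\1", c("john", "mcDonald", ""), perl = TRUE)
first_b
# "John" "McDonald" ""
```

Both versions leave the remaining letters unchanged, so "mcDonald" becomes "McDonald" and an empty string stays empty.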

An alternative to the previous example is str_to_title from stringr, but this function behaves differently: it capitalizes the first letter of each word and converts all the remaining letters to lowercase.

# install.packages("stringr")
library(stringr)

# Sample vector
x <- c("jOhn", "kaTE", "StaN")

# Transform to title case: only the first letter
# of each word is kept in upper case, even if the
# input contains other upper case letters
x <- stringr::str_to_title(x)
x
"John" "Kate" "Stan"

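If adding a dependency is not an option, a similar result can be approximated in base R. The sketch below is not stringr's implementation: it converts everything to lowercase and then upper-cases the first ASCII letter of each word, so it may differ from str_to_title for non-ASCII text. The helper name to_title_base is hypothetical.

```r
# Hypothetical base R approximation of title case (ASCII letters only)
to_title_base <- function(x) {
  # Lower-case everything, then upper-case the letter at each word start
  gsub("\\b([a-z])", "\\U\\1", tolower(x), perl = TRUE)
}

titles <- to_title_base(c("jOhn smith", "kaTE", "StaN LEE"))
titles
# "John Smith" "Kate" "Stan Lee"
```
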
To lowercase with tolower

The function used to transform character vectors into lowercase is named tolower. The following example demonstrates the basic usage of this function:

x <- "THIS IS A SAMPLE TEXT"

# Transform to lower case with tolower()
x <- tolower(x)
x
"this is a sample text"

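A common use of tolower is to normalize text before comparing or counting values that differ only in case. The following is a small sketch with made-up survey answers:

```r
# Hypothetical answers typed with inconsistent capitalization
answers <- c("Yes", "YES", "no", "No", "yes")

# Count the answers after normalizing them to lower case
counts <- table(tolower(answers))
counts[["yes"]]
# 3

# Case-insensitive comparison of two strings
same_city <- tolower("Madrid") == tolower("MADRID")
same_city
# TRUE
```
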
Column and column names to lowercase

The tolower function can also be applied to character columns of a data frame, as shown below:

# Sample data frame
df <- data.frame(x = 1:5, y = c("A", "B", "C", "D", "E"))

# Transform column to lower case
df$y <- tolower(df$y)
df
  x y
1 1 a
2 2 b
3 3 c
4 4 d
5 5 e

In addition, if you want to transform column names from uppercase to lowercase you could do the following:

# Sample data frame
df <- data.frame(COL1 = 1:5, COL2 = c("A", "B", "C", "D", "E"))

# Transform column names to lower case
colnames(df) <- tolower(colnames(df))
df
  col1 col2
1    1    A
2    2    B
3    3    C
4    4    D
5    5    E

The casefold function

The casefold function is a wrapper of toupper and tolower ‘provided for compatibility with S-PLUS’, which means a single function can be used to transform strings from uppercase to lowercase and vice versa. By default, the function converts to lowercase:

# Sample string
x <- "SAMPLE STRING"

# Transform to lower case
x <- casefold(x)
x
"sample string"

The function also provides an argument named upper, which can be set to TRUE to transform from lowercase to uppercase.

# Sample string
x <- "sample string"

# Transform to upper case
x <- casefold(x, upper = TRUE)
x
"SAMPLE STRING"

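The equivalence between casefold and the two dedicated functions can be checked directly. In the sketch below, the variable to_upper is a hypothetical flag used to show how the direction can be chosen at run time:

```r
x <- "Sample String"

# casefold() gives the same result as tolower() and toupper()
same_lower <- identical(casefold(x), tolower(x))
same_upper <- identical(casefold(x, upper = TRUE), toupper(x))
c(same_lower, same_upper)
# TRUE TRUE

# Hypothetical flag deciding the direction of the conversion
to_upper <- TRUE
folded <- casefold(x, upper = to_upper)
folded
# "SAMPLE STRING"
```
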
The chartr function

The last function we are going to review is chartr, which performs character translations: each character of the input (x) that appears in old is replaced by the character at the same position in new.

The following example illustrates how to transform the letters "s" inside "strings" to "S":

chartr(old = "s", new = "S", x = "strings")
"StringS"

Note that old and new can contain several characters; every occurrence of each character in old is replaced by the character at the same position in new:

chartr(old = "STR", new = "str", x = "STRINGS")
"strINGs"

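Two further details of chartr are worth a short sketch: the old and new specifications accept ranges such as "a-c", and old must not be longer than new. The input string is taken from the examples in the function's help page, and the variable failed is a hypothetical name used to capture whether an error was signaled.

```r
x <- "MiXeD cAsE 123"

# Ranges: a-c is translated to D-F, and X is translated to w
ranged <- chartr(old = "a-cX", new = "D-Fw", x = x)
ranged
# "MiweD FAsE 123"

# When old is longer than new, chartr() signals an error
failed <- tryCatch(
  {
    chartr(old = "abc", new = "A", x = "abc")
    FALSE
  },
  error = function(e) TRUE
)
failed
# TRUE
```
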
Finally, it is worth mentioning that this function is not limited to case folding, as it can replace any character of the input string with another:

chartr(old = "str", new = "ABC", x = "strings")
"ABCingA"

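Since every character is translated independently and only once, chartr can also swap the case of a string, something toupper and tolower cannot do on their own. The helper name swap_case below is hypothetical, and the sketch only covers the 26 unaccented letters:

```r
# Hypothetical helper: lower case becomes upper case and vice versa
swap_case <- function(x) {
  old <- paste(c(letters, LETTERS), collapse = "")
  new <- paste(c(LETTERS, letters), collapse = "")
  chartr(old = old, new = new, x = x)
}

swapped <- swap_case("Hello World 2024")
swapped
# "hELLO wORLD 2024"
```
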
R version 4.3.2 (2023-10-31 ucrt)