Calculate the mean in R
The mean (the sample counterpart of the expected value in statistics) is a measure of central tendency that represents the average of the data. In general, it is the sum of all observations divided by the number of observations (the arithmetic mean). In this tutorial we will review how to calculate the arithmetic mean as well as the trimmed, weighted and geometric means in R.
Arithmetic mean with the mean function
In order to calculate the arithmetic mean of a vector we can make use of the mean function. Consider the following sample vector that represents the exam grades of a student during the current year:
x <- c(2, 4, 3, 6, 3, 7, 5, 8)
Using the mean function we can calculate the student's mean grade:
mean(x) # 4.75
# Equivalent to:
sum(x) / length(x) # 4.75
Note that if for some reason some elements of the vector are missing (the vector contains some NA values), you should set the na.rm argument of the function to TRUE. Otherwise, the output will be NA.
# Vector with NA
x <- c(2, 4, 3, 6, 3, 7, 5, 8, NA)
# If the vector contains an NA value, the result will be NA
mean(x) # NA
# Remove the NA values
mean(x, na.rm = TRUE) # 4.75
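As a quick check of what na.rm = TRUE does, the following sketch drops the missing values by hand with is.na before averaging. The name x_na is introduced here only for illustration.

```r
# Vector with one missing value (x_na is an illustrative name)
x_na <- c(2, 4, 3, 6, 3, 7, 5, 8, NA)

# Count the missing values
sum(is.na(x_na)) # 1

# Dropping the NA values by hand gives the same result as na.rm = TRUE
mean(x_na[!is.na(x_na)]) # 4.75
mean(x_na, na.rm = TRUE) # 4.75
```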
Trimmed mean in R
The trimmed arithmetic mean removes a fraction of the observations from each end of the sorted vector before the mean is computed. This is especially useful when the vector contains outliers or other values we don’t want to influence the mean. For instance, if we trim 10% from each end, only the central 80% of the data will be used to compute the mean.
# Sample vector
y <- c(1, rep(5, 8), 50)
# Arithmetic mean
mean(y) # 9.1
# Arithmetic mean trimmed by 10% at each end
# (removes the smallest and the largest element in this example)
mean(y, trim = 0.1) # 5
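To make the trimming step concrete, here is a minimal sketch that reproduces the result by hand: sort the values, drop floor(n * trim) observations from each end, and average what is left. The helper names trim, k, y_sorted and y_kept are illustrative only.

```r
# Same sample vector as above, redefined so the block is self-contained
y <- c(1, rep(5, 8), 50)
trim <- 0.1

# Number of observations dropped from each end
k <- floor(length(y) * trim) # 1

# Sort, drop k values from each end, then average the rest
y_sorted <- sort(y)
y_kept <- y_sorted[(k + 1):(length(y) - k)]
mean(y_kept) # 5

# Built-in equivalent
mean(y, trim = trim) # 5
```

Because the number of dropped values is rounded down, a small trim on a short vector may remove nothing at all. With 10 observations, for example, trim = 0.05 leaves the vector untouched.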
Weighted mean in R with the weighted.mean function
The arithmetic mean gives every observation the same relevance. If we want some observations to count more than others, we can assign a different weight to each one (the arithmetic mean is the special case in which all weights are equal).
In order to assign weights we can make use of the weighted.mean function as follows:
# Sample vector
z <- c(5, 7, 8)
# Weights (here they sum to 1; weighted.mean divides by the sum of the weights in any case)
wts <- c(0.2, 0.2, 0.6)
# Weighted mean
weighted.mean(z, w = wts) # 7.2
Note that, because these weights sum to 1, the latter is equivalent to:
sum(z * wts) # 7.2
If your data contains any NA values, the function also provides the na.rm argument.
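The sketch below illustrates two points about weighted.mean: the weights do not need to sum to 1, since the function divides by their sum, and na.rm = TRUE drops a missing observation together with its weight. The names raw_wts and z_na are illustrative only.

```r
# Sample vector, redefined so the block is self-contained
z <- c(5, 7, 8)

# Weights that do not sum to 1 (same proportions as 0.2, 0.2, 0.6)
raw_wts <- c(1, 1, 3)

weighted.mean(z, w = raw_wts) # 7.2
# Equivalent to:
sum(z * raw_wts) / sum(raw_wts) # 7.2

# With a missing observation, na.rm = TRUE removes the NA and its weight
z_na <- c(5, NA, 8)
wts <- c(0.2, 0.2, 0.6)
weighted.mean(z_na, w = wts) # NA
weighted.mean(z_na, w = wts, na.rm = TRUE) # 7.25
```

Note that na.rm only applies to missing values in the data vector. A missing weight still produces NA.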
Geometric mean in R
The geometric mean is the n-th root of the product of the n elements of the vector. In order to calculate it you can combine the exp, mean and log functions, or use the geometric.mean function from the psych package, which includes the na.rm argument if needed.
# Sample vector
w <- c(10, 20, 15, 40)
# Geometric mean
exp(mean(log(w))) # 18.6121
# Alternative (which includes the na.rm argument)
# install.packages("psych")
library(psych)
geometric.mean(w) # 18.6121
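For comparison, the following base R sketch computes the geometric mean directly from its definition and also shows one way to skip missing values without extra packages. It assumes strictly positive values, since log is not defined for zero or negative numbers. The name w_na is illustrative only.

```r
# Sample vector, redefined so the block is self-contained
w <- c(10, 20, 15, 40)

# Direct definition: n-th root of the product of the elements
prod(w)^(1 / length(w)) # 18.6121

# Log-based form, same result
exp(mean(log(w))) # 18.6121

# Base R version that skips missing values
w_na <- c(10, 20, 15, 40, NA)
exp(mean(log(w_na), na.rm = TRUE)) # 18.6121
```

The log-based form is usually the safer choice for long vectors, because the raw product can overflow or underflow before the root is taken.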