Median absolute deviation in R
The mad function in R calculates the median absolute deviation (MAD), a measure of the dispersion of a dataset. The MAD is a robust measure of spread, an alternative to the standard deviation and the interquartile range, that is less sensitive to outliers.
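As a minimal sketch of what the function computes with its default arguments, the MAD can be reproduced by hand as the scaled median of the absolute deviations from the median. The data vector and the name manual_mad are illustrative and not part of the original example.

```r
# Illustrative data with one obvious outlier (50)
x <- c(2, 4, 4, 5, 7, 9, 50)

# Manual MAD: default constant times the median of |x - median(x)|
manual_mad <- 1.4826 * median(abs(x - median(x)))

manual_mad   # manual computation
mad(x)       # built-in function, same value
sd(x)        # standard deviation, inflated by the outlier
```

Here the median is 5 and the median absolute deviation from it is 2, so both computations give 2 * 1.4826, while the standard deviation is much larger because of the single extreme value.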
Syntax
The syntax of the mad function is the following:
mad(x, center = median(x), constant = 1.4826,
na.rm = FALSE, low = FALSE, high = FALSE)
Where:
x: a numeric vector.
center: the center of the data used to calculate the MAD. Defaults to the median of x.
constant: a scale factor, 1.4826 by default, which makes the MAD a consistent estimator of the standard deviation for normally distributed data.
na.rm: a logical value indicating whether missing values should be removed. Defaults to FALSE.
low: a logical value indicating whether to compute the "lo-median". Defaults to FALSE.
high: a logical value indicating whether to compute the "hi-median". Defaults to FALSE.
Examples
Given a sample vector named x, you can compute its median absolute deviation in R with the mad function as follows:
# Sample data
set.seed(19)
x <- rnorm(100)
# MAD
mad(x)
1.057287
Remember to set na.rm = TRUE if your data contains missing values.
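The effect of na.rm can be seen with a small vector that contains a missing value. This is only a sketch; the vector y and the result names are illustrative.

```r
# Illustrative data with one missing value
y <- c(1, 2, 3, 4, 100, NA)

# With the default na.rm = FALSE the result is NA
mad_with_na <- mad(y)
mad_with_na

# With na.rm = TRUE the missing value is dropped before computing the MAD
mad_without_na <- mad(y, na.rm = TRUE)
mad_without_na
```

After removing the NA, the median is 3 and the median absolute deviation is 1, so the second call returns the default constant, 1.4826.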
Center function
The default center is median(x), but you can supply another value instead, such as mean(x).
# Sample data
set.seed(19)
x <- rnorm(100)
# MAD with custom center function
mad(x, center = mean(x))
1.058821
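The choice of center matters most when the data contain outliers. The following sketch uses illustrative data and sets constant = 1 so that the raw deviations are easy to read; it suggests that centering on the mean gives up much of the robustness of the default.

```r
# Illustrative data with one outlier
z <- c(1, 2, 3, 4, 100)

# Centered on the median (3): absolute deviations are 2, 1, 0, 1, 97
mad_median_center <- mad(z, constant = 1)
mad_median_center

# Centered on the mean (22): absolute deviations are 21, 20, 19, 18, 78
mad_mean_center <- mad(z, center = mean(z), constant = 1)
mad_mean_center
```

The first call returns 1 and the second returns 20, because the outlier drags the mean far away from the bulk of the data.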
Constant
The default constant of 1.4826 (approximately 1/qnorm(3/4)) makes the MAD a consistent estimator of the standard deviation for normally distributed data. However, you can override the default value with the constant argument.
# Sample data
set.seed(19)
x <- rnorm(100)
# MAD with custom constant
mad(x, constant = 1)
0.7131301
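The relationship between the default constant and the normal quantile function can be checked directly. This is a small sketch; raw_mad and scaled_mad are illustrative names.

```r
# The default constant is approximately the reciprocal of the 0.75 normal quantile
1 / qnorm(3 / 4)

# Same sample data as above
set.seed(19)
x <- rnorm(100)

raw_mad <- mad(x, constant = 1)   # unscaled median absolute deviation
scaled_mad <- mad(x)              # default scaling

# Multiplying the unscaled value by 1.4826 recovers the default result
raw_mad * 1.4826
scaled_mad
```

In other words, the constant is only a multiplicative factor: setting constant = 1 returns the plain median of the absolute deviations.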
Lo-median
By default, when the sample size is even, the function takes the mean of the two central values of the sorted absolute deviations. However, when low = TRUE the function takes the smaller of the two middle values instead of their mean.
# Sample data
set.seed(19)
x <- rnorm(100)
# MAD lo-median
mad(x, low = TRUE)
1.05536
Hi-median
When high = TRUE and the sample size is even, the function takes the larger of the two middle values instead of their mean.
# Sample data
set.seed(19)
x <- rnorm(100)
# MAD hi-median
mad(x, high = TRUE)
1.059213
Note that it's not possible to set both low and high to TRUE at the same time.
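The three variants are easier to compare on a small even-length vector. The sketch below uses illustrative data and constant = 1 so that the middle values can be read off directly; the names w and both are hypothetical. With an even sample size, requesting both options at once stops with an error, which is caught here with tryCatch.

```r
# Illustrative even-length data: the median is 5.5 and the sorted
# absolute deviations are 1.5, 1.5, 3.5, 4.5, 5.5, 10.5
w <- c(1, 2, 4, 7, 11, 16)

mad(w, constant = 1)               # mean of the two middle values: 4
mad(w, constant = 1, low = TRUE)   # lo-median, the smaller one: 3.5
mad(w, constant = 1, high = TRUE)  # hi-median, the larger one: 4.5

# Setting both low and high to TRUE raises an error for even-length data
both <- tryCatch(
  mad(w, low = TRUE, high = TRUE),
  error = function(e) "error"
)
both
```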