Median absolute deviation in R

The mad function in R computes the median absolute deviation (MAD), a measure of the dispersion of a dataset. The MAD is a robust alternative to the standard deviation and the interquartile range, as it is less sensitive to outliers.
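As a rough illustration of that robustness, the following sketch compares how much the standard deviation and the MAD move when a single extreme value is appended to a small vector. The data and the names x_out, sd_shift and mad_shift are made up for this demonstration.

```r
# Hypothetical data, chosen only for illustration
x <- c(2, 3, 3, 4, 5, 5, 6, 7)
x_out <- c(x, 100)  # same data plus one extreme value

# Change in each dispersion measure caused by the extra value
sd_shift <- abs(sd(x_out) - sd(x))
mad_shift <- abs(mad(x_out) - mad(x))

sd_shift
mad_shift
```

In this example the standard deviation shifts far more than the MAD, which barely reacts to the added value.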

Syntax

The syntax of the mad function is the following:

mad(x, center = median(x), constant = 1.4826,
    na.rm = FALSE, low = FALSE, high = FALSE)

Where:

  • x: a numeric vector.
  • center: the center of the data used for calculating the MAD. By default, it uses the median of x.
  • constant: a scale factor, defaulting to 1.4826, which makes the MAD a consistent estimator of the standard deviation for normally distributed data.
  • na.rm: a logical value indicating whether missing values should be removed or not. Defaults to FALSE.
  • low: a logical value indicating whether to compute the “lo-median”. Defaults to FALSE.
  • high: a logical value indicating whether to compute the “hi-median”. Defaults to FALSE.
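Putting these arguments together, with the default settings the result should equal constant * median(abs(x - center)). A minimal sketch that checks this by hand (manual_mad is a name introduced here for illustration only):

```r
# Sample data
set.seed(19)
x <- rnorm(100)

# MAD computed step by step with the default center and constant
manual_mad <- 1.4826 * median(abs(x - median(x)))

# Compare against the built-in function
all.equal(manual_mad, mad(x))
```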

Examples

Given a sample vector named x, you can compute its median absolute deviation with the mad function as follows:

# Sample data
set.seed(19)
x <- rnorm(100)

# MAD
mad(x)
1.057287

Remember to set na.rm = TRUE if your data contains missing values.
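A small sketch of the effect of na.rm, using a made-up vector y that contains one missing value:

```r
# Hypothetical vector with a missing value
y <- c(1.2, 3.4, NA, 2.8, 5.1)

# Without na.rm the missing value propagates and the result is NA
mad(y)

# With na.rm = TRUE the MAD is computed on the non-missing values
mad(y, na.rm = TRUE)
```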

Center function

The default center is median(x), but you can also pass another value, such as mean(x).

# Sample data
set.seed(19)
x <- rnorm(100)

# MAD with custom center function
mad(x, center = mean(x))
1.058821
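One thing to watch for when combining a custom center with missing values: an expression such as mean(y) is evaluated on the original vector, so na.rm = TRUE inside mad does not clean it up. The sketch below uses a made-up vector y to show the difference.

```r
# Hypothetical vector with a missing value
y <- c(1.2, 3.4, NA, 2.8, 5.1)

# mean(y) is NA, so the result is NA even though na.rm = TRUE
mad(y, center = mean(y), na.rm = TRUE)

# Removing missing values in the center computation as well gives a number
mad(y, center = mean(y, na.rm = TRUE), na.rm = TRUE)
```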

Constant

The default constant of 1.4826 (1/qnorm(3/4)) makes the MAD a consistent estimator of the standard deviation for normally distributed data. However, you can override the default value with the constant argument.

# Sample data
set.seed(19)
x <- rnorm(100)

# MAD with custom constant
mad(x, constant = 1)
0.7131301
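To see where the default comes from and how it relates to the unscaled result, the following sketch computes 1/qnorm(3/4) and rescales the raw MAD by hand (raw_mad and scaled_mad are illustrative names):

```r
# The default constant is approximately the reciprocal of the 0.75 normal quantile
1 / qnorm(3 / 4)

# Sample data
set.seed(19)
x <- rnorm(100)

# Unscaled MAD, then rescaled with the default constant
raw_mad <- mad(x, constant = 1)
scaled_mad <- 1.4826 * raw_mad

all.equal(scaled_mad, mad(x))
```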

Lo-median

By default, when the sample size is even, the median of the absolute deviations is the mean of the two central values. When low = TRUE, the function takes the smaller of the two middle values instead of their mean.

# Sample data
set.seed(19)
x <- rnorm(100)

# MAD lo-median
mad(x, low = TRUE)
0.7131301

Hi-median

When high = TRUE and the sample size is even, the function takes the larger of the two middle values instead of their mean.

# Sample data
set.seed(19)
x <- rnorm(100)

# MAD hi-median
mad(x, high = TRUE)
1.059213
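The difference between the three variants is easier to see on a tiny even-length vector. The sketch below uses made-up data and sets constant = 1 so that the raw middle values are visible; the sorted absolute deviations from the median (5.5) are 1.5, 1.5, 3.5, 4.5, 5.5 and 10.5, so the two middle values are 3.5 and 4.5.

```r
# Hypothetical even-length sample
x <- c(1, 2, 4, 7, 11, 16)

lo  <- mad(x, constant = 1, low = TRUE)   # 3.5, the smaller middle value
mid <- mad(x, constant = 1)               # 4, the mean of the two middle values
hi  <- mad(x, constant = 1, high = TRUE)  # 4.5, the larger middle value

c(lo = lo, default = mid, hi = hi)
```

For an even sample size the default result is therefore the average of the lo-median and hi-median versions.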

Note that it is not possible to set both low and high to TRUE at the same time.
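If you want to guard against this in a script, a minimal sketch is to wrap the call in tryCatch and capture the error message instead of stopping. The vector and the name msg are illustrative, and the example assumes an even-length sample.

```r
# Hypothetical even-length sample
x <- c(1, 2, 4, 7, 11, 16)

# Requesting both variants at once is expected to raise an error;
# capture its message rather than halting the script
msg <- tryCatch(
  mad(x, low = TRUE, high = TRUE),
  error = function(e) conditionMessage(e)
)

msg
```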