# Wilcoxon tests in R with wilcox.test()

The `wilcox.test` function in R performs the Wilcoxon **signed rank test** (to test whether the median of one sample equals a specific value, or to compare paired samples) and the Wilcoxon **rank sum test** (also known as the **Mann-Whitney U test**), which compares two independent samples to assess whether one distribution is stochastically equal to, greater than or less than the other.

The Wilcoxon tests are non-parametric tests used when the data doesn’t meet the assumptions of parametric tests, such as normality.

The Kruskal-Wallis test is the generalization of the Wilcoxon rank sum test to \(k\) independent samples.

## Syntax of `wilcox.test`

The `wilcox.test` function in R has the following syntax:

```
wilcox.test(x, y = NULL,
alternative = c("two.sided", "less", "greater"),
mu = 0, paired = FALSE, exact = NULL, correct = TRUE,
conf.int = FALSE, conf.level = 0.95,
tol.root = 1e-4, digits.rank = Inf, ...)
```

Where:

- `x`: numeric vector of data values representing the first sample.
- `y`: (optional) numeric vector of data values representing the second sample, used for a paired or Mann-Whitney test. If not specified (`NULL`, the default), a one-sample Wilcoxon test is performed on `x`.
- `alternative`: alternative hypothesis. Possible values are `"two.sided"` (the default), `"less"` and `"greater"`.
- `mu`: the hypothesized median (for a one-sample test) or the hypothesized location shift between the samples (for a two-sample test).
- `paired`: a logical indicating whether the samples represent paired data.
- `exact`: whether to compute an exact p-value. Possible values are `TRUE`, `FALSE` and `NULL` (the default, where an exact p-value is computed for small to moderate-sized samples).
- `correct`: a logical indicating whether to apply a continuity correction in the normal approximation of the p-value.
- `conf.int`: a logical indicating whether to compute a confidence interval.
- `conf.level`: the confidence level of the interval, used when `conf.int = TRUE`.
- `tol.root`: a numeric tolerance used in some calculations when `conf.int = TRUE`.
- `digits.rank`: number of significant digits used when computing ranks.
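As a brief illustration of the `conf.int` and `conf.level` arguments, the sketch below requests a 95% confidence interval in the one-sample case. The data vector is the one used in the signed rank examples of this tutorial, and `res` is only an illustrative variable name.

```r
# Sample data (same vector used in the one-sample examples)
x <- c(0.22, 1.05, -0.24, 0.18, 0.16, -0.67, 2.49, -0.07, -0.13, -0.03)

# Run the test and request a confidence interval at the same time
res <- wilcox.test(x, mu = 0, conf.int = TRUE, conf.level = 0.95)

# Point estimate of the (pseudo)median and its confidence interval
res$estimate
res$conf.int
```

The returned object is a list of class `"htest"`, so individual components such as `res$p.value` or `res$statistic` can be extracted in the same way.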

## Signed rank test

The Wilcoxon signed rank test is a non-parametric statistical test **used to determine whether the median of a single sample significantly differs from a hypothesized median value** (usually zero).

### Median equal to a specific value

The null and alternative hypotheses in this scenario are the following:

- \(H_0\): the median of X **IS** \(\mu\).
- \(H_1\): the median of X **IS NOT** \(\mu\).

Here \(\mu\) is the specific median value to be tested, which is usually zero.

Consider the following data, named `x`, represented using a box plot:

```
# Sample data
x <- c(0.22, 1.05, -0.24, 0.18, 0.16, -0.67, 2.49, -0.07, -0.13, -0.03)
# Box plot
boxplot(x, col = 4, xlab = "Sample")
```

You can perform a Wilcoxon signed rank test to check if the median of `x` is equal to 0:

```
# Sample data
x <- c(0.22, 1.05, -0.24, 0.18, 0.16, -0.67, 2.49, -0.07, -0.13, -0.03)
# Is the median of 'x' equal to 0?
wilcox.test(x, mu = 0)
```

```
Wilcoxon signed rank exact test
data: x
V = 34, p-value = 0.5566
alternative hypothesis: true location is not equal to 0
```

The previous test returns a p-value of 0.5566, greater than the usual significance levels, so **there is no evidence to reject the null hypothesis** that the median of `x` equals 0.
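To make the `V` statistic in the output less opaque, here is a minimal sketch of how it can be reproduced by hand: rank the absolute differences from `mu` and add up the ranks of the positive differences. The variable names `d`, `r` and `V` are illustrative only.

```r
# Sample data
x <- c(0.22, 1.05, -0.24, 0.18, 0.16, -0.67, 2.49, -0.07, -0.13, -0.03)
mu <- 0

# Differences from the hypothesized median (zeros would be discarded)
d <- x - mu
d <- d[d != 0]

# Rank the absolute differences and sum the ranks of the positive ones
r <- rank(abs(d))
V <- sum(r[d > 0])
V
```

For this sample the result matches the `V = 34` reported by `wilcox.test(x, mu = 0)`.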

### Median greater than a specific value

The previous test was two-sided, but you **can also perform one-sided tests** with the `alternative` argument. When `alternative = "greater"`, the null and alternative hypotheses are the following:

- \(H_0\): the median of X is **LOWER THAN OR EQUAL** to \(\mu\).
- \(H_1\): the median of X is **GREATER** than \(\mu\).

The following example illustrates how to test if the median of `x` is greater than -1:

```
# Sample data
x <- c(0.22, 1.05, -0.24, 0.18, 0.16, -0.67, 2.49, -0.07, -0.13, -0.03)
# Is the median of 'x' greater than -1?
wilcox.test(x, mu = -1, alternative = "greater")
```

```
Wilcoxon signed rank exact test
data: x
V = 55, p-value = 0.0009766
alternative hypothesis: true location is greater than -1
```

The p-value is close to zero, implying there is **statistical evidence against the null hypothesis** that the median of X is lower than or equal to -1.
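The values `V = 55` and `p-value = 0.0009766` can be checked by hand, because every observation in `x` lies above -1. The sketch below is a sanity check under that observation; it uses `psignrank()`, the distribution function of the signed rank statistic in base R, and the names `d`, `n`, `V` and `p_manual` are illustrative.

```r
# Sample data
x <- c(0.22, 1.05, -0.24, 0.18, 0.16, -0.67, 2.49, -0.07, -0.13, -0.03)
mu <- -1

# All differences from mu are positive for this sample
d <- x - mu
all(d > 0)

# V is then the sum of all ranks, n * (n + 1) / 2
n <- length(d)
V <- sum(rank(abs(d))[d > 0])

# Exact one-sided p-value: P(V >= observed value) under the null hypothesis
p_manual <- psignrank(V - 1, n, lower.tail = FALSE)
p_manual
```

Since `V` takes its largest possible value here, only one of the \(2^{10}\) equally likely sign patterns is at least as extreme, so the p-value equals \(1/2^{10}\).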

### Median less than a specific value

The last option is to set `alternative = "less"`. In this scenario the null and alternative hypotheses are:

- \(H_0\): the median of X is **GREATER THAN OR EQUAL** to \(\mu\).
- \(H_1\): the median of X is **LOWER** than \(\mu\).

The following code performs a Wilcoxon test to check if the median of `x` is less than 0:

```
# Sample data
x <- c(0.22, 1.05, -0.24, 0.18, 0.16, -0.67, 2.49, -0.07, -0.13, -0.03)
# Is the median of 'x' lower than 0?
wilcox.test(x, mu = 0, alternative = "less")
```

```
Wilcoxon signed rank exact test
data: x
V = 34, p-value = 0.7539
alternative hypothesis: true location is less than 0
```

The function returns a p-value of 0.7539, greater than the usual significance levels. Therefore, **there is no evidence to reject the null hypothesis** that the median of `x` is greater than or equal to 0.

### Paired samples

The Wilcoxon signed rank test can also be computed for two related groups or paired observations (X and Y) to determine if their medians are significantly different. To perform this test, input two vectors of data and set `paired = TRUE`. The hypotheses are the following:

- \(H_0\): the median of X **IS EQUAL** to the median of Y (the difference of medians is 0).
- \(H_1\): the median of X **IS DIFFERENT** from the median of Y (the difference of medians is not 0).

```
# Sample data
x <- c(0.22, 1.05, -0.24, 0.18, 0.16, -0.67, 2.49, -0.07, -0.13, -0.03)
y <- c(0.98, -1.25, 0.03, 0.25, -0.72, 1.33, 0.81, 0.09, -0.04, -1.31)
# Is the median of 'x' equal to the median of 'y'? (Paired samples)
wilcox.test(x, y, mu = 0, paired = TRUE)
```

```
Wilcoxon signed rank exact test
data: x and y
V = 31, p-value = 0.7695
alternative hypothesis: true location shift is not equal to 0
```

The test returns a p-value of 0.7695, implying there is **no statistical evidence to reject the null hypothesis** that the medians are equal (the difference of medians is zero).
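Internally, the paired test works on the differences `x - y`, so it should give the same result as a one-sample signed rank test applied to those differences. The sketch below checks this equivalence; `paired_res` and `diff_res` are illustrative names.

```r
# Sample data (paired observations)
x <- c(0.22, 1.05, -0.24, 0.18, 0.16, -0.67, 2.49, -0.07, -0.13, -0.03)
y <- c(0.98, -1.25, 0.03, 0.25, -0.72, 1.33, 0.81, 0.09, -0.04, -1.31)

# Paired test on the two vectors
paired_res <- wilcox.test(x, y, mu = 0, paired = TRUE)

# One-sample test on the paired differences
diff_res <- wilcox.test(x - y, mu = 0)

# Both calls report the same statistic and p-value
c(paired = unname(paired_res$statistic), one_sample = unname(diff_res$statistic))
c(paired = paired_res$p.value, one_sample = diff_res$p.value)
```

This also explains why both vectors must have the same length and the same ordering of subjects when `paired = TRUE`.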

## Wilcoxon rank sum test (Mann-Whitney U test) for two samples

The Wilcoxon Rank-Sum test, also known as the Mann-Whitney U test, is **employed to compare the distributions of two independent samples**. It’s used when the assumptions of the t-test (such as normality and equal variances) are not met.

### Equal medians (equal distributions)

The null and alternative hypotheses in this scenario are the following:

- \(H_0\): the median of X **IS EQUAL** to the median of Y.
- \(H_1\): the median of X **IS DIFFERENT** from the median of Y.

Note that the previous hypotheses can be reformulated using a location shift of \(\mu\) (usually \(\mu = 0\), implying equal medians):

- \(H_0\): the location shift between X and Y **IS** \(\mu\).
- \(H_1\): the location shift between X and Y **IS NOT** \(\mu\).

For the following examples we are going to use two independent samples named `x` and `y`, represented in the box plot below.

```
# Sample data
x <- c(0.22, 0.8, -0.24, 0.18, 0.16, -0.67, 1.19, -0.07, -0.13, -0.03)
y <- c(0.19, 3.00, 1.34, 1.35, 1.95, 1.63, 0.68, 0.51, 1.65, 0.92)
# Box plot
boxplot(x, y, col = 4, names = c("Sample 1","Sample 2"))
```

To **test whether the two samples have the same distribution (equal medians)**, you can pass both vectors to the `wilcox.test` function, as demonstrated below.

```
# Sample data
x <- c(0.22, 0.8, -0.24, 0.18, 0.16, -0.67, 1.19, -0.07, -0.13, -0.03)
y <- c(0.19, 3.00, 1.34, 1.35, 1.95, 1.63, 0.68, 0.51, 1.65, 0.92)
# Is the median of 'x' equal to the median of 'y'?
wilcox.test(x, y)
```

```
Wilcoxon rank sum exact test
data: x and y
W = 8, p-value = 0.0007253
alternative hypothesis: true location shift is not equal to 0
```

The p-value is lower than the usual significance levels, so there is evidence against the null hypothesis that the median of `x` is equal to the median of `y`. This suggests that **the medians of the two samples are likely not equal**, indicating a shift in the central tendencies between the groups.
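The `W` statistic reported by R can be reproduced in two equivalent ways when there are no ties: as the sum of the ranks of `x` in the pooled sample minus its minimum possible value, or as the number of pairs in which a value of `x` exceeds a value of `y`. The following is a minimal sketch; `n1`, `r`, `W_ranks` and `W_pairs` are illustrative names.

```r
# Sample data
x <- c(0.22, 0.8, -0.24, 0.18, 0.16, -0.67, 1.19, -0.07, -0.13, -0.03)
y <- c(0.19, 3.00, 1.34, 1.35, 1.95, 1.63, 0.68, 0.51, 1.65, 0.92)

# Ranks of all observations in the pooled sample
n1 <- length(x)
r <- rank(c(x, y))

# Rank-based definition: rank sum of 'x' minus n1 * (n1 + 1) / 2
W_ranks <- sum(r[seq_len(n1)]) - n1 * (n1 + 1) / 2

# Pair-counting definition: number of (x, y) pairs with x > y
W_pairs <- sum(outer(x, y, ">"))

c(W_ranks = W_ranks, W_pairs = W_pairs)
```

Both values equal the `W = 8` shown in the output: only 8 of the 100 possible pairs have the `x` value above the `y` value, which is why the test points to `x` being shifted below `y`.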

### Median of X greater than the median of Y

For the one-sided case, when `alternative = "greater"`, the hypotheses are the following:

- \(H_0\): the median of X is **LOWER THAN OR EQUAL** to the median of Y.
- \(H_1\): the median of X **IS GREATER** than the median of Y.

The block of code below demonstrates how to perform a Wilcoxon rank sum test to check if the median of `x` is greater than the median of `y`.

```
# Sample data
x <- c(0.22, 0.8, -0.24, 0.18, 0.16, -0.67, 1.19, -0.07, -0.13, -0.03)
y <- c(0.19, 3.00, 1.34, 1.35, 1.95, 1.63, 0.68, 0.51, 1.65, 0.92)
# Is the median of 'x' greater than median of 'y'?
wilcox.test(x, y, alternative = "greater")
```

```
Wilcoxon rank sum exact test
data: x and y
W = 8, p-value = 0.9998
alternative hypothesis: true location shift is greater than 0
```
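With a p-value of 0.9998 there is no evidence to reject the null hypothesis that the median of `x` is lower than or equal to the median of `y`. As a sketch of where the one-sided exact p-values come from, they can be recomputed from the null distribution of `W` with `pwilcox()` from base R. The names `W`, `p_greater` and `p_less` are illustrative.

```r
# Sample data
x <- c(0.22, 0.8, -0.24, 0.18, 0.16, -0.67, 1.19, -0.07, -0.13, -0.03)
y <- c(0.19, 3.00, 1.34, 1.35, 1.95, 1.63, 0.68, 0.51, 1.65, 0.92)

# Observed statistic and sample sizes
W <- unname(wilcox.test(x, y)$statistic)
n1 <- length(x)
n2 <- length(y)

# alternative = "greater": probability of a W at least as large as observed
p_greater <- pwilcox(W - 1, n1, n2, lower.tail = FALSE)

# alternative = "less": probability of a W at least as small as observed
p_less <- pwilcox(W, n1, n2)

c(greater = p_greater, less = p_less)
```

Because the observed `W` is counted in both tails, the two one-sided p-values add up to slightly more than 1.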

### Median of X less than the median of Y

The last option is to set `alternative = "less"`, which has the following hypotheses:

- \(H_0\): the median of X is **GREATER THAN OR EQUAL** to the median of Y.
- \(H_1\): the median of X **IS LESS** than the median of Y.

Type the following to test if the median of `x` is lower than the median of `y` (the difference is less than 0):

```
# Sample data
x <- c(0.22, 0.8, -0.24, 0.18, 0.16, -0.67, 1.19, -0.07, -0.13, -0.03)
y <- c(0.19, 3.00, 1.34, 1.35, 1.95, 1.63, 0.68, 0.51, 1.65, 0.92)
# Is the median of 'x' lower than the median of 'y'?
wilcox.test(x, y, alternative = "less")
```

```
Wilcoxon rank sum exact test
data: x and y
W = 8, p-value = 0.0003626
alternative hypothesis: true location shift is less than 0
```

The p-value is close to zero, implying there is evidence to reject the null hypothesis that the median of `x` is greater than or equal to the median of `y`. Therefore, there is sufficient statistical evidence to favor the alternative hypothesis, indicating a **probable difference in the locations (medians) of the two samples**, with `x` likely having a lower median than `y`.
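When the two samples are stored in a data frame in long format rather than in two separate vectors, `wilcox.test` also accepts a formula. The sketch below assumes a hypothetical data frame `df` with a numeric column `value` and a two-level grouping column `group`; these names are not part of the examples above and are used only for illustration.

```r
# Sample data stored in long format (hypothetical data frame and column names)
x <- c(0.22, 0.8, -0.24, 0.18, 0.16, -0.67, 1.19, -0.07, -0.13, -0.03)
y <- c(0.19, 3.00, 1.34, 1.35, 1.95, 1.63, 0.68, 0.51, 1.65, 0.92)
df <- data.frame(
  value = c(x, y),
  group = factor(rep(c("x", "y"), each = 10))
)

# Rank sum test using the formula interface: response ~ two-level group
res_formula <- wilcox.test(value ~ group, data = df)
res_formula
```

The first factor level plays the role of `x`, so the direction of a one-sided `alternative` depends on the order of the levels of `group`.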