Stripchart in R

Use the stripchart function to create strip charts in R programming

Stripcharts, also called strip plots, are one-dimensional scatter charts. When dealing with small sample sizes (few data points), stripcharts in R are an alternative to boxplots or scatter plots for representing the observations. A stripchart is also useful for overlaying the data on other plots in order to show the distribution of the data. In this tutorial we explain, with examples, how to make and customize a stripchart in R.

The R stripchart() function

The stripchart function in R allows you to create one dimensional scatter plots. In order to create a default stripchart, pass a numerical variable to the function:

set.seed(1)

x <- rnorm(20)

stripchart(x)

stripchart in base R

You can also customize the symbol used to draw the points, its line width and its color with the pch, lwd and col arguments, respectively. Note that symbols 21 to 25 allow you to modify the background color of the symbol with the bg argument.

stripchart(x, pch = 21, col = 1, bg = 2, lwd = 2)

Pch colors of stripchart

By default, the function draws a box with tick labels on the X-axis. However, you can remove the box and the axes by setting the axes argument to FALSE. You can then use the axis function to add the axes you prefer.

stripchart(x, axes = FALSE)
axis(1)
axis(2)

Using the axes argument

An alternative is to remove only the box and keep the axis by setting the argument frame to FALSE (the full name of the argument is frame.plot, which R matches partially).

stripchart(x, frame = FALSE)

Removing the frame of a stripplot

Flip stripchart axis in R

The R stripchart is drawn horizontally by default. Nonetheless, you can flip the axes by setting the argument vertical to TRUE in order to create a plot like the following:

stripchart(x, vertical = TRUE)

Flip stripchart axis in R programming

In addition, if you specify the argument las = 2, the tick labels are drawn perpendicular to the axes, so the labels of the vertical axis are displayed horizontally.

stripchart(x, vertical = TRUE, las = 2)

Vertical stripplot in R

Chart methods

The stripchart function has three different methods to draw the data. By default the function uses the method 'overplot', which in case of ties draws the tied observations on top of each other. The other methods are 'stack', which stacks the tied observations, creating a plot similar to a histogram, and 'jitter', which adds random noise to separate the observations. Consider, for instance, the following data with ties:

set.seed(1)

x <- round(runif(100, 0 , 10))

Overplot

As we pointed out before, overplot is the default method of the R stripchart function. Note that because of the ties, although there are 100 data points, the plot shows at most 11 distinct positions (the integers from 0 to 10).

stripchart(x, pch = 19, col = 4, main = "method = 'overplot'")

Overplot method of a stripchart
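To see why so few points are visible, it can help to count the distinct values in the data. The following is a minimal check using only base R functions; it recreates the x vector from above so that it runs on its own, and the names n_distinct and counts are illustrative.

```r
set.seed(1)
x <- round(runif(100, 0, 10))

# Number of observations versus number of distinct values
length(x)
n_distinct <- length(unique(x))
n_distinct

# Frequency of each value: with method = "stack" each value
# would be drawn as a pile containing this many points
counts <- table(x)
counts
```

The number of distinct values is the number of positions that the overplot method can show, no matter how many observations share each position.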

Stack

In order to display all the data you can set the argument method to 'stack'. This configuration will stack the repeated data points, creating a plot that represents the distribution of the data.

stripchart(x, pch = 19, method = "stack",
           col = 4, main = "method = 'stack'")

stack method of a stripchart

Jitter

The last alternative is the method 'jitter', which adds random noise along the vertical axis if the plot is horizontal, or along the X-axis if the plot is vertical, in order to try to show all the data points.

stripchart(x, pch = 19, method = "jitter",
           col = 4, main = "method = 'jitter'")

jitter method to add random noise

With this method you can also set the argument jitter, which controls the amount of random noise and defaults to 0.1. Greater (lower) values increase (decrease) the amount of random noise applied to the points.

par(mfrow = c(1, 2))

set.seed(2)
stripchart(x, pch = 19, method = "jitter", jitter = 0.2,
           col = 4, main = "method = 'jitter', jitter = 0.2")
axis(2)

set.seed(2)
stripchart(x, pch = 19, method = "jitter", jitter = 0.5,
           col = 4, main = "method = 'jitter', jitter = 0.5")
axis(2)

par(mfrow = c(1, 1))

Modifying the jitter argument

In the previous figure you can observe that the difference between the plots is the scale of the vertical axis. Nonetheless, if there are many data points some of them will still overlap. If you want to avoid overlapping points altogether, consider a beeswarm chart instead.
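To make the role of the jitter argument more concrete, the sketch below mimics the jitter method by hand with plot and runif. It assumes that the noise is uniform on the interval from -jitter to jitter around the group position (1 for a single group); this is an illustration of the idea rather than a copy of the function's internals, and the names amount and noise are hypothetical.

```r
set.seed(1)
x <- round(runif(100, 0, 10))

amount <- 0.5  # plays the role of the jitter argument (illustrative name)

# Assumed behaviour: uniform noise around the group position 1
set.seed(2)
noise <- runif(length(x), -amount, amount)

plot(x, 1 + noise, pch = 19, col = 4, ylim = c(0, 2),
     xlab = "", ylab = "", main = "Manual jitter, amount = 0.5")
```

Because the noise is bounded by amount, a larger value spreads tied points over a wider band but cannot guarantee that no two points overlap.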

R stripchart by group

With the stripchart function you can also use a formula of the form y ~ x, where y is a numerical variable and x is a categorical variable or factor representing groups. In order to create an R stripchart by factor you can type:

set.seed(1)

x <- rnorm(100)

groups <- sample(c("A", "B", "C"), 100, replace = TRUE)

stripchart(x ~ groups, group.names = c("A", "B", "C"), pch = 19, method = "jitter",
           jitter = 0.2, vertical = TRUE, col = rainbow(length(unique(groups))))

R stripchart by factor with colors

Note that we passed a vector of colors (as many as the number of groups) to the col argument in order to add colors by group.
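When the data is stored in a data frame, the formula can refer to its columns through the data argument. The sketch below assumes a hypothetical data frame df with columns value and group; since group is a factor, the group labels are taken from its levels and group.names does not need to be supplied.

```r
set.seed(1)
df <- data.frame(value = rnorm(100),
                 group = factor(sample(c("A", "B", "C"), 100, replace = TRUE)))

# One color per factor level
group_cols <- rainbow(nlevels(df$group))

stripchart(value ~ group, data = df, pch = 19, method = "jitter",
           jitter = 0.2, vertical = TRUE, col = group_cols)
```

Deriving the number of colors from nlevels keeps the color vector in step with the groups if the data changes.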

Add mean to R stripchart

Sometimes it can be interesting to display the mean of the data points plotted with a strip chart. For a single stripchart you have two options: adding a mean point or a mean line with the points or abline functions, respectively.

set.seed(3)
y <- rexp(50)

par(mfrow = c(1, 2))

stripchart(y, pch = 16, col = 5, method = "jitter")
# A single strip is drawn at height 1, so the mean goes at (mean(y), 1)
points(mean(y), 1, col = 1, pch = 7, cex = 2, lwd = 2)

stripchart(y, pch = 16, col = 5, method = "jitter")
abline(v = mean(y), col = 1, lwd = 2, lty = 2)

par(mfrow = c(1, 1))

Add mean points and mean lines to single stripchart

For a stripchart by factor you will need to calculate the mean for each group and add the mean points to each one as follows:

set.seed(3)
y <- rexp(100)

# Generating groups
groups <- sample(c("A", "B", "C"), 100, replace = TRUE)

par(mfrow = c(1, 2))

# Calculating the means
means <- sapply(levels(factor(groups)), function(i) mean(y[groups == i])) 

# Horizontal stripchart
stripchart(y ~ groups, pch = 16, col = grey.colors(3),
           method = "jitter", las = 1)
points(means, 1:3, col = "red", pch = 7, cex = 1.5, bg = 2, lwd = 2)

# Vertical stripchart
stripchart(y ~ groups, pch = 16, col = grey.colors(3),
           method = "jitter", vertical = TRUE)
points(means, col = "red", pch = 7, cex = 1.5, bg = 2, lwd = 2)

par(mfrow = c(1, 1))

Add mean points to R stripplot by group
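As a possible shortcut, the group means can also be computed with tapply, which applies a function to the values of each group. The snippet below recreates the data from above and compares both approaches; the names means_tapply and means_sapply are illustrative.

```r
set.seed(3)
y <- rexp(100)
groups <- sample(c("A", "B", "C"), 100, replace = TRUE)

# Group means in one call; the names follow the sorted group labels
means_tapply <- tapply(y, groups, mean)

# The sapply() approach used above, for comparison
means_sapply <- sapply(levels(factor(groups)),
                       function(i) mean(y[groups == i]))

all.equal(as.vector(means_tapply), as.vector(means_sapply))
```

Either vector can be passed to points in the same way as the means object in the previous example.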

Add a stripchart to a boxplot

Finally, it is worth mentioning that stripcharts are commonly displayed over boxplots when there are few observations, because boxplots hide the underlying distribution of the data. In order to add a stripchart over a plot, create the plot first and then specify add = TRUE in the stripchart function.

set.seed(4)
z <- rnorm(75)

par(mfrow = c(1, 2))

# Single boxplot
boxplot(z)
stripchart(z, add = TRUE, vertical = TRUE,
           method = "jitter", col = 2, pch = 19)

# Boxplot by group
g <- sample(c("A", "B"), 75, replace = TRUE)

boxplot(z ~ g)
stripchart(z ~ g, add = TRUE, vertical = TRUE,
           method = "jitter", col = 3:4, pch = 19)

par(mfrow = c(1, 1))

R Stripchart and boxplot

Note that the argument at allows you to modify the position at which the stripchart is drawn, so the points do not have to be drawn on top of the main plot.

set.seed(4)
z <- rnorm(75)

boxplot(z, horizontal = TRUE, xlim = c(0.5, 2))
stripchart(z, add = TRUE, method = "jitter", col = 2, pch = 19, at = 1.75)

Position of a stripchart with the argument at
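The same idea can be applied to grouped data by giving one position per group. The sketch below places each strip to the right of its box; the positions in box_pos and strip_pos are illustrative values chosen for this example, and xlim is widened so that all four positions fit in the plot.

```r
set.seed(4)
z <- rnorm(75)
g <- sample(c("A", "B"), 75, replace = TRUE)

box_pos   <- c(1, 3)  # positions of the boxes (illustrative values)
strip_pos <- c(2, 4)  # positions of the strips, right of each box

boxplot(z ~ g, at = box_pos, xlim = c(0.5, 4.5))
stripchart(z ~ g, add = TRUE, vertical = TRUE, at = strip_pos,
           method = "jitter", col = 3:4, pch = 19)
```

Drawing the strips beside the boxes keeps the quartiles readable while still showing every observation.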