Dot plot in R

A dot plot or dot chart is similar to a scatter plot. The main difference is that a dot plot in R displays the index (each category) on the vertical axis and the corresponding value on the horizontal axis, so you can read the value of each observation by following a horizontal line from its label.

These graphs can also be used as an alternative to horizontal bar plots. In addition, you can arrange the points on the vertical axis into groups and even sort them by some variable.

Note that there are several types of dot charts, such as the classical dot plot, Cleveland's version and the Dumbbell dot plot. In this tutorial we are going to show how to create Cleveland dot plots and Dumbbell charts in R.

The dotchart function

The `dotchart` function allows you to create a Cleveland's dot plot in R. Consider the following dataset, which represents the expected and actual sales of a company for each month.

``````set.seed(1)

month <- month.name
expected <- c(15, 16, 20, 31, 11, 6,
17, 22, 32, 12, 19, 20)
sold <- c(8, 18, 12, 10, 41, 2,
19, 26, 14, 16, 9, 13)
quarter <- c(rep(1, 3), rep(2, 3), rep(3, 3), rep(4, 3))

data <- data.frame(month, expected, sold, quarter)
data``````
``````       month expected sold quarter
1    January       15    8       1
2   February       16   18       1
3      March       20   12       1
4      April       31   10       2
5        May       11   41       2
6       June        6    2       2
7       July       17   19       3
8     August       22   26       3
9  September       32   14       3
10   October       12   16       4
11  November       19    9       4
12  December       20   13       4``````

You can create a dot chart in R of the `sold` variable by passing it to the `dotchart` function. You can also label each data point with the `labels` argument and specify additional arguments, like the symbol, the background color of the symbol or the symbol size, with the `pch`, `bg` and `pt.cex` arguments, respectively.

``dotchart(data$sold, labels = data$month, pch = 21, bg = "green", pt.cex = 1.5)``

Dot plot by group in R

If you have a variable that categorizes the data into groups, you can split the dot chart by those groups by passing the variable to the `groups` argument. You can also assign a color to each group with the `color` argument.

``````# Groups: one color per row, looked up from the quarter number (1 to 4)
colors <- c("red", "blue", "green", "orange")[data$quarter]

# groups must line up row by row with the plotted values;
# dotchart draws the groups in decreasing order from the bottom
dotchart(data$expected, labels = data$month, pch = 19,
         pt.cex = 1.5, groups = data$quarter, color = colors)``````
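If you also want a heading next to each group, one option is to pass a factor to `groups`, since `dotchart` prints the factor levels as group labels. The sketch below rebuilds a reduced copy of the example data so that it runs on its own; the `sales` data frame, the `quarter_f` column and the "Q1" and "Q2" labels are illustrative names, not part of the original example.

```r
# Reduced copy of the example data (six months, two quarters)
sales <- data.frame(
  month    = month.name[1:6],
  expected = c(15, 16, 20, 31, 11, 6),
  quarter  = rep(1:2, each = 3)
)

# Hypothetical helper column: the quarter as a factor with readable levels
sales$quarter_f <- factor(sales$quarter, levels = 1:2, labels = c("Q1", "Q2"))

# One color per row, looked up from the quarter number
cols <- c("red", "blue")[sales$quarter]

# With a factor, dotchart uses the level names as group headings
dotchart(sales$expected, labels = sales$month, groups = sales$quarter_f,
         color = cols, pch = 19, pt.cex = 1.5)
```

The color of the group headings is controlled separately, with the `gcolor` argument of `dotchart`.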

Order dotchart in R by a variable

In addition, `dotchart` keeps the order of the data it receives (within each group, if there are groups), so to order a dot plot in R by a variable you need to sort the data first. For that purpose you can type:

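``````# Sort by quarter first and by expected sales within each quarter
ord <- order(data$quarter, data$expected)
x <- data[ord, ]

# Reorder the colors with the same index so they still match the rows
dotchart(x$expected, labels = x$month, pch = 19,
         xlim = range(x$expected, x$sold) + c(-2, 2),
         pt.cex = 1.5, color = colors[ord], groups = x$quarter)``````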
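Because `order` returns row positions rather than sorted values, the same index can reorder any other vector that is aligned with the data, such as a vector of colors. The following sketch shows this on a reduced copy of the example data; the `sales`, `cols` and `ord` names are only illustrative.

```r
# Reduced copy of the example data (six months, two quarters)
sales <- data.frame(
  month    = month.name[1:6],
  expected = c(15, 16, 20, 31, 11, 6),
  quarter  = rep(1:2, each = 3)
)

# Colors aligned with the rows of sales
cols <- c("red", "blue")[sales$quarter]

# order() returns the row positions that sort the variable
ord <- order(sales$expected)

# Apply the same index to the data frame and to the color vector
sorted <- sales[ord, ]
sorted_cols <- cols[ord]

# decreasing = TRUE reverses the direction
ord_desc <- order(sales$expected, decreasing = TRUE)

# Ungrouped dot chart ordered by expected sales, colored by quarter
dotchart(sorted$expected, labels = sorted$month, color = sorted_cols,
         pch = 19, pt.cex = 1.5)
```

Keeping a single index and reusing it is a simple way to avoid colors or labels drifting out of step with the sorted values.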

Dumbbell dot plot in R

Sometimes it is interesting to create a dot chart with two variables, representing the minimum and maximum values of some events or the change of some observations in time.

In our example, it could be interesting to represent the `sold` and `expected` variables together, to analyze the difference between the expected and actual sales. This type of dot chart is known as a Dumbbell chart or Dumbbell plot.

``````dotchart(data$sold, pch = 21, labels = data$month, bg = "green",
         pt.cex = 1.5, xlim = range(data$expected, data$sold) + c(-2, 2))
points(data$expected, 1:nrow(data), col = "red", pch = 19, cex = 1.5)``````
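The distance between the two dots of each row can also be checked numerically. This is a small sketch on a shortened copy of the example data; the `sales` data frame, the `gap` column and the `beat` vector are hypothetical names introduced only for this illustration.

```r
# Shortened copy of the example data (first six months)
sales <- data.frame(
  month    = month.name[1:6],
  expected = c(15, 16, 20, 31, 11, 6),
  sold     = c(8, 18, 12, 10, 41, 2)
)

# gap > 0 means that the month sold more than expected
sales$gap <- sales$sold - sales$expected

# Months sorted from the largest shortfall to the largest surplus
sales[order(sales$gap), c("month", "gap")]

# Months in which actual sales exceeded the expected sales
beat <- sales$month[sales$gap > 0]
beat
```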

You can also add segments and text labels to the points as follows:

``````dotchart(data$sold, labels = data$month, pch = 21, bg = "green",
         xlim = range(data$expected, data$sold) + c(-2, 2),
         pt.cex = 1.5)

# Smaller and larger value of each row, and the row positions
low <- pmin(data$sold, data$expected)
high <- pmax(data$sold, data$expected)
y <- seq_len(nrow(data))

# Segment joining both values, with the values printed at each end
segments(low, y, high, y, lwd = 2)
text(low - 1.5, y, labels = low)
text(high + 1.5, y, labels = high)

# Draw the points last so that they sit on top of the segments
points(data$expected, y, col = "red", pch = 19, cex = 1.5)
points(data$sold, y, col = "red", pch = 21, bg = "green", cex = 1.5)``````

However, this is not easy to handle, and you can't use this approach when you specify groups, because the vertical position of each row changes. As there is no base R graphics alternative that provides this functionality, we have developed the `dumbbell` function, which works with grouped and ungrouped data. Its arguments let you choose whether to add the segments, the text, both or just the points, and customize the plot with the additional arguments.

``````# v1: numeric variable
# v2: numeric variable
# group: vector (numeric or character) or a factor containing groups
# labels: labels for the dot chart
# segments: whether to add segments (TRUE) or not (FALSE)
# text: whether to add text (TRUE) or not (FALSE)
# pch: symbol
# colv1: color of the variable v1. If you want to
#        add group colors add them here
# colv2: color of the variable v2
# pt.cex: size of the points
# segcol: color of the segment
# lwd: width of the segment
# ... : additional arguments to be passed to dotchart function

dumbbell <- function(v1, v2, group = rep(1, length(v1)), labels = NULL,
                     segments = FALSE, text = FALSE, pch = 19,
                     colv1 = 1, colv2 = 1, pt.cex = 1, segcol = 1,
                     lwd = 1, ...) {

  n <- length(v1)

  # dotchart draws the groups in decreasing order from the bottom, so put
  # every per-row input in that same order before plotting
  g <- as.numeric(as.factor(group))
  o <- sort.list(g, decreasing = TRUE)
  v1 <- v1[o]
  v2 <- v2[o]
  labels <- labels[o]
  colv1 <- rep_len(colv1, n)[o]
  colv2 <- rep_len(colv2, n)[o]
  group <- if (is.factor(group)) group[o] else g[o]

  # Vertical position of each row: a gap of two units between groups
  offset <- cumsum(c(0, diff(as.numeric(group)) != 0))
  y <- seq_len(n) + 2 * offset

  dotchart(v1, labels = labels, color = colv1, xlim = range(v1, v2) + c(-2, 2),
           groups = group, pch = pch, pt.cex = pt.cex, ...)

  if (segments) {
    segments(pmin(v1, v2), y, pmax(v1, v2), y, lwd = lwd, col = segcol)
  }

  # Draw v2 first so that the v1 points stay on top
  points(v2, y, pch = pch, cex = pt.cex, col = colv2)
  points(v1, y, pch = pch, cex = pt.cex, col = colv1)

  if (text) {
    text(pmin(v1, v2) - 1.5, y, labels = pmin(v1, v2))
    text(pmax(v1, v2) + 1.5, y, labels = pmax(v1, v2))
  }
}``````
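To see how the function finds the vertical position of each row, the `offset` and `y` computation can be run on its own. The sketch below uses a made-up `group` vector and mirrors the corresponding lines of `dumbbell`, which in turn follow the way `dotchart` leaves a gap of two units between consecutive groups.

```r
# Made-up groups: two rows in group 1, three in group 2, one in group 3
group <- c(1, 1, 2, 2, 2, 3)

# Groups are drawn in decreasing order, starting from the bottom of the chart
o <- sort.list(group, decreasing = TRUE)
g <- group[o]

# offset increases by one every time the group changes
offset <- cumsum(c(0, diff(g) != 0))

# Each change of group pushes the following rows up by two extra units
y <- seq_along(g) + 2 * offset

data.frame(group = g, offset = offset, y = y)
```

These `y` values are the heights at which the segments, points and texts have to be drawn so that they line up with the rows of the grouped dot chart.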

With this function you can create several combinations. Consider the example where you want to show the comparison between actual sales (blue) and expected sales (black) for each month. You could write the following:

``````dumbbell(v1 = data$expected, v2 = data$sold, text = FALSE,
         labels = data$month, segments = TRUE, pch = 19,
         pt.cex = 1.5, colv1 = 1, colv2 = "blue")``````

Now, if you want to divide the data into groups and also add a text label with each value, you could type:

``````dumbbell(v1 = data$expected, v2 = data$sold, group = data$quarter,
         text = TRUE, labels = data$month, segments = TRUE, pch = 19,
         pt.cex = 1.5, colv1 = 1, colv2 = "blue")``````

In addition, if you want to add colors for each group you can use the `colv1` argument.

``````dumbbell(v1 = data$expected, v2 = data$sold, group = data$quarter,
         text = TRUE, labels = data$month, segments = TRUE,
         pch = 19, pt.cex = 1.5, colv1 = colors)``````

Finally, as we did in the previous section, you can also order the data by some variable:

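``````# Sort by quarter first and by expected sales within each quarter,
# and take the groups from the sorted data so they match its rows
x <- data[order(data$quarter, data$expected), ]

dumbbell(v1 = x$expected, v2 = x$sold, group = x$quarter,
         text = TRUE, segcol = "gray", lwd = 3, labels = x$month,
         segments = TRUE, pch = 19, pt.cex = 1.5, colv1 = 1, colv2 = "blue")``````

Note that the black dots (expected sales) appear in increasing order within each quarter.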