Home » Statistics with R » Poisson distribution in R

# Poisson distribution in R

## The Poisson distribution

Denote a Poisson process as a random experiment that consist on observe the occurrence of specific events over a continuous support (generally the space or the time), such that the process is stable (the number of occurrences, \lambda is constant in the long run) and the events occur randomly and independently.

The Poisson distribution is used to model the number of events that occur in a Poisson process. Let X \sim P(\lambda), this is, a random variable with Poisson distribution where the mean number of events that occur at a given interval is \lambda:

• The probability mass function (PMF) is P(X = x) =\frac{e^{- \lambda} \lambda^x}{x!} for x = 0, 1, 2, \dots.
• The cumulative distribution function (CDF) is F(x) =\sum_{i = 0}^x \frac{e ^{- \lambda} \lambda^i}{i!}.
• The quantile function is Q(p) = F ^{-1}(p).
• The expected mean and variance of X are E(X) = Var(X) = \lambda.

The functions described in the list before can be computed in R for a set of values with the dpois (probability mass), ppois (distribution) and qpois (quantile) functions. Moreover, the rpois function allows obtaining n random observations that follow a Poisson distribution. The table below describes briefly each of these functions.

## The dpois function

The Poisson probability function with mean \lambda can be calculated with the R dpois function for any value of x. The following block of code summarizes the arguments of the function:

dpois(x,           # X-axis values (x = 0, 1, 2, ...)
lambda,      # Mean number of events that occur on the interval
log = FALSE) # If TRUE, probabilities are given as log

As an example, if you want to calculate the Poisson mass probability function for x \in \{0, 1, \dots, 10\} with mean 5, you can type:

dpois(0:10, lambda = 5)
0.006737947 0.033689735 0.084224337 0.140373896 0.175467370 0.175467370
0.146222808 0.104444863 0.065278039 0.036265577 0.018132789

You can also specify a vector of means instead of a single value, as in the following block:

dpois(5, lambda = c(5, 10)) # 0.17546737 0.03783327

In the previous example, the first element of the output is from a distribution with mean \lambda = 5 and the second from a distribution with mean \lambda = 10 events per interval.

### Plot of the Poisson probability function in R

The Poisson probability mass function can be plotted in R making use of the plot function, as in the following example:

# Grid of X-axis values
x <- 0:50

#-----------
# lambda: 5
#-----------
lambda <- 5
plot(dpois(x, lambda), type = "h", lwd = 2,
main = "Poisson probability function",
ylab = "P(X = x)", xlab = "Number of events")

#-----------
# lambda: 10
#-----------
lambda <- 10
lines(dpois(x, lambda), type = "h", lwd = 2, col = rgb(1,0,0, 0.7))

#-----------
# lambda: 20
#-----------
lambda <- 20
lines(dpois(x, lambda), type = "h", lwd = 2, col = rgb(0, 1, 0, 0.7))

# Legend
legend("topright", legend = c("5", "10", "20"),
title = expression(lambda), title.adj = 0.75,
lty = 1, col = 1:3, lwd = 2, box.lty = 0)

## The ppois function

The probability of a variable X following a Poisson distribution taking values equal or lower than x can be calculated with the ppois funtion, which arguments are described below:

ppois(q,                 # Quantile or vector of quantiles
lambda,            # Mean or vector of means
lower.tail = TRUE, # If TRUE, probabilities are P(X <= x), or P(X > x) otherwise
log.p = FALSE)     # If TRUE, probabilities are given as log

If you want to calculate, for instance, the probability of observing 5 or less events (P(X \leq 5)) if the mean of events occurring on a specific interval is 10 you can type:

ppois(5, lambda = 10) # 0.06708596

In this example, the previous result is equivalent to the sum of the probabilities of each value up to 5:

sum(dpois(0:5, lambda = 10)) # 0.06708596

### ppois example

In this section we are going to present a more detailed example using the ppois function. Consider that the number of visits on a web page is known to follow a Poisson distribution with mean 15 visits per hour. Hence, \lambda = 15.

• The probability of getting 10 or less visits per hour, P(X \leq 10), is:
ppois(10, lambda = 15) # 0.1184644 or 11.8%
1 - ppois(10, lambda = 15, lower.tail = FALSE) # Equivalent
sum(dpois(0:10, lambda = 15)) # Equivalent

As the Poisson distribution is discrete, the cumulative probability is calculated adding the corresponding probabilities of the probability function. The following R function allows to visualize the probabilities that are added based on a lower bound and an upper bound.

# lambda: mean
# lb: lower bound of the sum
# ub: upper bound of the sum
# col: color
# lwd: line width
pois_sum <- function(lambda, lb, ub, col = 4, lwd = 1, ...) {
x <- 0:(lambda + lambda * 2)

if (missing(lb)) {
lb <- min(x)
}
if (missing(ub)) {
ub <- max(x)
}

plot(dpois(x, lambda = lambda), type = "h", lwd = lwd, ...)

if(lb == min(x) & ub == max(x)) {
color <- col
} else {
color <- rep(1, length(x))
color[(lb + 1):ub ] <- col
}

lines(dpois(x, lambda = lambda), type = "h",
col =  color, lwd = lwd, ...)
}

By way of illustration, if you want to display the probabilities that have been added to calculate the probability of observing between 10 and 15 events, if 10 events occur on average on each interval, you can type:

 pois_sum(lambda = 10, lb = 10, ub = 15, lwd = 2,
col = 2, ylab = "P(X = x)", xlab = "Number of events")
The sum of the probabilities displayed in red is equal to ppois(15, lambda = 10) - ppois(10, lambda = 10).

The calculated probability (11.8%) corresponds to the sum of the following probabilities:

pois_sum(lambda = 15, ub = 10, lwd = 2,
ylab = "P(X = x)", xlab = "Visits per hour")
• The probability of getting more than 20 visits per hour, P(X > 20), is:
ppois(20, lambda = 15, lower.tail = FALSE) # 0.08297091 or 8.3%
1 - ppois(20, lambda = 15)        # Equivalent
1 - sum(dpois(0:20, lambda = 15)) # Equivalent

This probability corresponds to:

pois_sum(lambda = 15, lb = 20, lwd = 2,
ylab = "P(X = x)", xlab = "Visits per hour")
• The probability of getting less than 15 visits per hour, P(X < 15) is:
ppois(14, lambda = 15, lower.tail = FALSE) # 0.5343463 or 53.43%
1 - sum(dpois(0:15, lambda = 15)) # Equivalent

Note that we set 14 instead of 15, because the Poisson probability is discrete, so P(X < 15) =P(X \leq 14). The corresponding plot is as follows:

pois_sum(lambda = 15, ub = 14, lwd = 2,
ylab = "P(X = x)", xlab = "Visits per hour")
• The probability of receiving between 10 and 20 visits per hour is:
ppois(20, lambda = 15) - ppois(10, lambda = 15) # 0.7985647 or 79.86%
sum(dpois(11:20, lambda = 15)) # Equivalent

The probability can be represented making use of the function we defined before:

pois_sum(lambda = 15, lb = 10, ub = 20, lwd = 2,
ylab = "P(X = x)", xlab = "Visits per hour")
As the Poisson distribution is a discrete distribution P(X = x) \neq 0, so P(X \geq x) \neq P(X > x) and P(X \leq x) \neq P(X < x).

### Plot of the Poisson distribution function in R

The cumulative distribution of the Poisson distribution can be represented for different values of \lambda with the following block of code:

# Grid of X-axis values
x <- 0:50

#-----------
# lambda: 5
#-----------
lambda <- 5
plot(ppois(x, lambda), type = "s", lwd = 2,
main = "Poisson distribution function",
xlab = "Number of events", ylab = "F(x)")

#-----------
# lambda: 10
#-----------
lambda <- 10
lines(ppois(x, lambda), type = "s", lwd = 2, col = 2)

#-----------
# lambda: 20
#-----------
lambda <- 20
lines(ppois(x, lambda), type = "s", lwd = 2, col = 3)

# Legend
legend("bottomright", legend = c("5", "10", "20"),
title = expression(lambda), title.adj = 0.75,
lty = 1, col = 1:3, lwd = 2, box.lty = 0)

## The qpois function

The R qpois function allows obtaining the corresponding Poisson quantiles for a set of probabilities.

qpois(p,                 # Probability or vector of probabilities
lambda,            # Mean or vector of means
lower.tail = TRUE, # If TRUE, probabilities are P(X <= x), or P(X > x) otherwise
log.p = FALSE)     # If TRUE, probabilities are given as log

For instance, the quantile 0.5 of a Poisson distribution is equal to the mean:

qpois(0.5, lambda = 10) # 10

### Plot of the Poisson quantile function

The Poisson quantile function can be plotted in R for a set of probabilities. The following graph shows the outcomes of the qpois function for different means.

#-----------
# lambda: 20
#-----------
plot(qpois(seq(0, 1, 0.001), lambda = 20),
main = "Poisson quantile functions",
ylab = "Q(p)", xlab = "p",
type = "s", col = 3, xaxt = "n")

axis(1, labels = seq(0, 1, 0.1), at = 0:10 * 100)

#-----------
# lambda: 10
#-----------
lines(qpois(seq(0, 1, 0.001), lambda = 10), type = "s", col = 2)

#-----------
# lambda: 5
#-----------
lines(qpois(seq(0, 1, 0.001), lambda = 5), type = "s")

# Legend
legend("topleft", legend = c("5", "10", "20"),
title = expression(lambda), title.adj = 0.75,
lty = 1, col = 1:3, lwd = 2, box.lty = 0)

## The rpois function

If you want to draw n observations from a Poisson distribution you can make use of the rpois function. The following block of code summarizes the arguments of the function.

rpois(n,      # Number of random observations to be generated
lambda) # Mean or vector of means

If you want to obtain 10 random observations from a Poisson distribution with mean 4 in R you can type:

rpois(10, lambda = 4)
7 6 2 2 3 6 3 4 4 7

However, the previous output won’t be reproducible. In case you need to generate a reproducible sequence of numbers you can set a seed with any integer number as follows:

set.seed(10)
rpois(10, lambda = 4)
4 3 3 5 1 2 3 3 4 3