Saturday, October 22, 2011

Hypergeometric (draws w/o replacement) and Binomial / Bernoulli (draws with replacement) distributions

In probability theory and statistics, the binomial distribution is the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p. Such a success/failure experiment is also called a Bernoulli experiment or Bernoulli trial; when n = 1, the binomial distribution is a Bernoulli distribution. In other words, a binomial random variable counts the successes in n independent repetitions of the same Bernoulli trial. The binomial distribution is the basis for the popular binomial test of statistical significance.
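As a small sanity check of the definition, the sketch below compares R's dbinom with the closed form choose(n, k) * p^k * (1 - p)^(n - k) and shows the n = 1 (Bernoulli) case. The values n = 10 and p = 0.3 are assumptions picked for illustration only.

```r
# Assumed illustration values (not from the text above)
n <- 10; p <- 0.3; k <- 0:n

# Binomial pmf written out from the closed form
pmf_manual  <- choose(n, k) * p^k * (1 - p)^(n - k)
# Same pmf from the built-in density function
pmf_builtin <- dbinom(k, size = n, prob = p)
print(all.equal(pmf_manual, pmf_builtin))   # TRUE

# With n = 1 the binomial reduces to a Bernoulli distribution:
# P(failure) = 1 - p, P(success) = p
bern <- dbinom(c(0, 1), size = 1, prob = p)
print(bern)   # 0.7 0.3
```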

The binomial distribution is frequently used to model the number of successes in a sample of size n drawn with replacement from a population of size N. If the sampling is carried out without replacement, the draws are not independent and so the resulting distribution is a hypergeometric distribution, not a binomial one. However, for N much larger than n, the binomial distribution is a good approximation and is widely used.
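To see how the approximation improves as N grows, here is a rough sketch that keeps the sample at 5 draws with 4 successes and half the population counted as successes, then increases the population size. The population sizes in N are hypothetical values chosen for illustration.

```r
# 5 draws, 4 successes, half of the population are successes
k <- 5; q <- 4

# With replacement: binomial with p = 1/2
p_binom <- dbinom(q, size = k, prob = 0.5)   # 5/32 = 0.15625

# Without replacement, for growing (hypothetical) population sizes
N <- c(52, 520, 5200, 52000)
p_hyper <- dhyper(q, m = N / 2, n = N / 2, k = k)

# The absolute difference shrinks as N gets large relative to the sample
print(data.frame(N = N,
                 hypergeometric = p_hyper,
                 binomial = p_binom,
                 abs_diff = abs(p_hyper - p_binom)))
```

For N = 52 the two answers still differ in the second decimal place, so a single deck of cards is not a case where the binomial shortcut is safe.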

http://en.wikipedia.org/wiki/Binomial_distribution
http://en.wikipedia.org/wiki/Hypergeometric_distribution

http://stattrek.com/online-calculator/hypergeometric.aspx
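Draw 5 cards from a standard 52-card deck without replacement. What is the probability that exactly 4 are red? In R's dhyper(q, m, n, k), m is the number of red cards in the deck, n the number of non-red cards, k the number of cards drawn, and q the number of red cards in the hand.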

> tot <- 52; m <- 26; n <- tot-m; k <- 5; q <- 4; dhyper(q,m,n,k)
[1] 0.1495598
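The same number can be reproduced by direct counting, which may make it clearer what dhyper computes. This is only a sketch of the counting argument; p_manual is a name introduced here for illustration.

```r
tot <- 52; m <- 26; n <- tot - m; k <- 5; q <- 4

# Ways to pick q red cards times ways to pick (k - q) black cards,
# divided by the number of k-card hands from the full deck
p_manual <- choose(m, q) * choose(n, k - q) / choose(tot, k)

print(p_manual)
print(all.equal(p_manual, dhyper(q, m, n, k)))   # TRUE
```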


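With the same parameters, phyper gives the cumulative probability P(X <= q), that is, the chance that at most 4 of the 5 cards are red:

> tot <- 52; m <- 26; n <- tot-m; k <- 5; q <- 4; phyper(q,m,n,k)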
[1] 0.9746899
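As a quick check of how the two functions relate, the sketch below rebuilds the cumulative value from the point probabilities and also computes the upper tail ("at least 4 red"), which is often the question one actually wants answered. The variable names p_at_most and p_at_least are introduced here for illustration.

```r
tot <- 52; m <- 26; n <- tot - m; k <- 5; q <- 4

# P(X <= 4): sum of the point probabilities for 0..4 red cards
p_at_most <- sum(dhyper(0:q, m, n, k))
print(all.equal(p_at_most, phyper(q, m, n, k)))   # TRUE

# P(X >= 4): lower.tail = FALSE returns P(X > q - 1)
p_at_least <- phyper(q - 1, m, n, k, lower.tail = FALSE)
print(all.equal(p_at_least, sum(dhyper(q:k, m, n, k))))   # TRUE
```

Note the q - 1 in the upper-tail call: phyper with lower.tail = FALSE is strict (X > x), so passing q itself would give the probability of 5 red cards only.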


