
Bayes' Theorem
**Bayes' theorem** is a result of probability theory which states that:
math P(A|B) = \frac{P(B|A) P(A)}{P(B)} math
where P(A|B) is the conditional probability of event A given event B, P(B|A) is the conditional probability of B given A, and P(A) and P(B) are the unconditional probabilities of A and B.
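As a quick illustration of the formula, here is a minimal sketch that evaluates it with exact fractions. The die example and the helper name `bayes` are assumptions made for illustration only; they are not part of the original text.

```python
from fractions import Fraction

def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) = P(B|A) P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

# Hypothetical example with a fair die:
# A = "the die shows 6", B = "the die shows an even number".
p_a = Fraction(1, 6)       # P(A)
p_b = Fraction(1, 2)       # P(B)
p_b_given_a = Fraction(1)  # P(B|A): a 6 is always even

p_a_given_b = bayes(p_b_given_a, p_a, p_b)
print(p_a_given_b)  # 1/3
```

Knowing that the result is even raises the probability of a 6 from 1/6 to 1/3, as expected from counting the three even faces.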
Consider the following problem: given a histogram of the measured frequencies of some event, what is the probability that the true distribution is a given function?
Let's simplify by assuming an experiment with two possible outcomes (1 and 2). Suppose that in a statistical survey, event 1 is found n1 times and event 2 is found n2 times. What is the probability that the true probability of event 1 is p?
By Bayes' theorem we have:
math P(p | n_1,n_2) = \frac{P(n_1,n_2| p) P(p) }{P(n_1,n_2)} math
As we do not know the (unconditional) probability of the distribution being p, let's assume that any distribution is as good as any other, that is, that P(p) is a constant. P(n1,n2) does not depend on p either, so both constants are fixed by the normalization condition that the posterior must satisfy: math \int_0^1 P(p| n_1, n_2) dp = 1 math
Then we have: math P(p | n_1,n_2) = \frac{P(n_1,n_2| p) }{ \int_0^1 P(n_1, n_2|p) dp} math
The probability that 1 happens n1 times and 2 happens n2 times given that the probability of 1 is p is given by the binomial distribution: math P(n_1,n_2| p) \propto p^{n_1} (1-p)^{n_2} math
Noting that: math \int_0^1 p^{n_1} (1-p)^{n_2} dp = \frac{n_1!n_2!}{(n_1+n_2+1)!} math
we finally get: math P(p | n_1,n_2) = \frac{p^{n_1} (1-p)^{n_2}}{ \int_0^1 p^{n_1} (1-p)^{n_2} dp} = \frac{(n_1+n_2+1)!}{n_1!n_2!}p^{n_1} (1-p)^{n_2} math
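The normalization can be checked numerically. The sketch below is only an illustration: the counts n1 = 7 and n2 = 3 are hypothetical, and the midpoint-rule integrator is a simple stand-in for any quadrature routine. The resulting density is the one usually called the Beta distribution with parameters n1+1 and n2+1.

```python
from math import factorial

def posterior(p, n1, n2):
    """Posterior density P(p | n1, n2) under a constant prior P(p)."""
    norm = factorial(n1 + n2 + 1) / (factorial(n1) * factorial(n2))
    return norm * p**n1 * (1 - p)**n2

def integrate(f, a=0.0, b=1.0, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

n1, n2 = 7, 3  # hypothetical counts, chosen only for illustration
total = integrate(lambda p: posterior(p, n1, n2))
print(abs(total - 1.0) < 1e-6)  # the density integrates to 1 within tolerance
```

With no observations at all (n1 = n2 = 0) the function returns 1 for every p, recovering the flat prior.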
The mean probability of 1 happening is: math \left\langle p \right\rangle = \int_0^1 P(p | n_1,n_2) p dp = \frac{(n_1+n_2+1)!}{n_1!n_2!} \int_0^1 p^{n_1+1} (1-p)^{n_2} dp =\frac{(n_1+n_2+1)!}{n_1!n_2!}\frac{(n_1+1)!n_2!}{(n_1+n_2+2)!} math
So: math \left\langle p \right\rangle = \frac{n_1+1}{n_1+n_2+2} math
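The factorial cancellation can be verified with exact arithmetic. This is a minimal sketch; the function names and the sample counts are assumptions chosen for illustration. The result is often referred to as Laplace's rule of succession.

```python
from fractions import Fraction
from math import factorial

def mean_from_factorials(n1, n2):
    """Posterior mean written as the unsimplified ratio of factorials."""
    norm = Fraction(factorial(n1 + n2 + 1), factorial(n1) * factorial(n2))
    integral = Fraction(factorial(n1 + 1) * factorial(n2), factorial(n1 + n2 + 2))
    return norm * integral

def posterior_mean(n1, n2):
    """Simplified posterior mean (n1 + 1) / (n1 + n2 + 2)."""
    return Fraction(n1 + 1, n1 + n2 + 2)

n1, n2 = 7, 3  # hypothetical counts
print(posterior_mean(n1, n2))                                  # 2/3
print(mean_from_factorials(n1, n2) == posterior_mean(n1, n2))  # True
```

Note that the mean differs from the raw frequency n1/(n1+n2): for 7 and 3 it is 2/3 rather than 7/10, and with no data at all it is 1/2.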
The mean quadratic probability is: math \left\langle p^2 \right\rangle = \int_0^1 P(p | n_1,n_2) p^2 dp = \frac{(n_1+n_2+1)!}{n_1!n_2!} \int_0^1 p^{n_1+2} (1-p)^{n_2} dp =\frac{(n_1+n_2+1)!}{n_1!n_2!}\frac{(n_1+2)!n_2!}{(n_1+n_2+3)!} math
So: math \left\langle p^2 \right\rangle = \frac{(n_1+1)(n_1+2)}{(n_1+n_2+2)(n_1+n_2+3)} math
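And the variance is: math \sigma^2 = \left\langle p^2 \right\rangle - \left\langle p \right\rangle^2 = \frac{(n_1+1)(n_1+2)}{(n_1+n_2+2)(n_1+n_2+3)} - \frac{(n_1+1)^2}{(n_1+n_2+2)^2} math
math \sigma^2 = \frac{(n_1+1)(n_1+2)(n_1+n_2+2) - (n_1+1)^2(n_1+n_2+3)}{(n_1+n_2+2)^2(n_1+n_2+3)} math
Factoring (n1+1) out of the numerator, the remaining bracket reduces to n2+1, so: math \sigma^2 = \frac{(n_1+1)(n_2+1)}{(n_1+n_2+2)^2(n_1+n_2+3)} math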
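The variance can be cross-checked with exact fractions. In this sketch the function names and the sample counts are illustrative assumptions; it also compares the difference of moments with the fully simplified form (n1+1)(n2+1) / ((n1+n2+2)^2 (n1+n2+3)), which follows from expanding the numerator.

```python
from fractions import Fraction

def variance_from_moments(n1, n2):
    """Variance computed as <p^2> - <p>^2 from the two posterior moments."""
    n = n1 + n2
    mean = Fraction(n1 + 1, n + 2)
    mean_sq = Fraction((n1 + 1) * (n1 + 2), (n + 2) * (n + 3))
    return mean_sq - mean**2

def variance_closed_form(n1, n2):
    """Simplified variance (n1+1)(n2+1) / ((n1+n2+2)^2 (n1+n2+3))."""
    n = n1 + n2
    return Fraction((n1 + 1) * (n2 + 1), (n + 2) ** 2 * (n + 3))

n1, n2 = 7, 3  # hypothetical counts
print(variance_from_moments(n1, n2))                                  # 2/117
print(variance_from_moments(n1, n2) == variance_closed_form(n1, n2))  # True
```

The simplified form is symmetric in n1 and n2 and shrinks as the total number of observations grows, so more data gives a narrower posterior.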