MAT411 Bayesian Data Analysis

Now that we know what Bayes' theorem is, let's derive it. By the definition of conditional probability,

$$ P(A|B) = \frac{P(A\cap B)}{P(B)} $$

$$ P(B|A) = \frac{P(B\cap A)}{P(A)} $$

so, ...

$$ P(A\cap B) = P(B\cap A) \implies \frac{P(A\cap B)}{P(A)} = P(B|A) \implies P(A\cap B) = P(B|A)P(A) $$

Hence,

$$ P(A|B) = \frac{P(B|A)P(A)}{P(B)} $$

Or, more conventionally expressed,

$$ P(\theta | D) = \frac{P(D|\theta)P(\theta)}{P(D)} $$

Let's delve deeper into its meaning.

Here, $\theta$ can be thought of as our belief/hypothesis, and $D$ as the data set.

Moreover, this can be expressed as

$$ \text{posterior} = \frac{\text{likelihood} \times \text{prior}}{\text{marginal likelihood}} $$

The prior distribution, $P(\theta)$, is the probability of our belief/hypothesis before we see any data.

The likelihood, or $P(D| \theta)$, is the probability of the data given the belief/hypothesis that we have.

The posterior, $P(\theta|D)$, is the distribution of our belief/hypothesis after we see the data.

And the marginal likelihood, $P(D)$ (sometimes called the normalized likelihood), is the probability of observing the data under all hypotheses combined.
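
For a discrete set of hypotheses, this amounts to summing the likelihood, weighted by the prior, over every hypothesis:

$$ P(D) = \sum_{\theta} P(D|\theta)P(\theta) $$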

An example of this could be our Robber question.

$$ P(\text{Male Given Diabetic}) = \frac{P(\text{Male and Diabetic})}{P(\text{Diabetic}) }$$

We could re-write this as

$$ P(\text{Male | Diabetic}) = \frac{P(\text{Diabetic | Male})\times P(\text{Male})}{P(\text{Diabetic}) }$$

So our

  • prior is the belief that the robber is Male, $P(\text{Male})$
  • likelihood is the probability that they are Diabetic given that they are Male, $P(\text{Diabetic | Male})$
  • marginal likelihood is the probability of being Diabetic, $P(\text{Diabetic})$
  • posterior is the probability that they are Male given that they are Diabetic, $P(\text{Male | Diabetic})$
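
To make the arithmetic concrete, here is a plug-in with purely illustrative numbers (assumptions, not real statistics): suppose half the population is male, 10% of males are diabetic, and 8% of the whole population is diabetic. Then

$$ P(\text{Male | Diabetic}) = \frac{0.10 \times 0.50}{0.08} = 0.625 $$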

This can be read as follows: our prior information, together with the likelihood of the data under that prior belief, determines our posterior view, scaled by some normalizing constant.

This scaling does not depend on our beliefs; it's purely a property of the distribution of the data. Hence we can say that our posterior is proportional to the likelihood times the prior.

Or, since the marginal likelihood (i.e., the probability of the observed data) does not depend on $\theta$,

$$ \text{Posterior} \propto \text{likelihood}\times \text{Prior} $$

Given a certain prior and likelihood, we can compute an unnormalized posterior that is proportional to the true posterior distribution.

Let's look at flipping an unfair coin.

First, we will generate an experiment.
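
A minimal simulation sketch; the bias `true_p = 0.7`, the flip count, and the seed are all assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)   # assumed seed, for reproducibility
true_p = 0.7                      # assumed bias of the unfair coin
n_flips = 100                     # assumed number of flips
flips = rng.binomial(1, true_p, size=n_flips)  # 1 = heads, 0 = tails
n_heads = flips.sum()
print(f"{n_heads} heads in {n_flips} flips")
```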

Next, let's create a prior distribution over a finite set of candidate values.
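
One way to do this, assuming a uniform prior over an evenly spaced grid of candidate biases:

```python
import numpy as np

theta_grid = np.linspace(0, 1, 101)   # finite set of candidate biases
prior = np.ones_like(theta_grid)      # flat prior: every bias equally likely
prior /= prior.sum()                  # normalize so the prior sums to 1
```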

Remember that

$$ \text{Posterior} \propto \text{likelihood}\times \text{Prior} $$

OR $$ P(\theta | D) = \frac{P(D|\theta)P(\theta)}{P(D)} $$

So our likelihood is the binomial probability of the observed number of heads given the number of flips.
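
Continuing with `n_heads`, `n_flips`, and `theta_grid` from the snippets above, a sketch using scipy's binomial pmf:

```python
from scipy.stats import binom

# Probability of the observed data under each candidate bias
likelihood = binom.pmf(n_heads, n_flips, theta_grid)
```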

We then have to find the marginal likelihood $P(D)$ by summing the likelihood, weighted by the prior, over every grid value.

Now apply Bayes rule. $$ P(\theta | D) = \frac{P(D|\theta)P(\theta)}{P(D)} $$
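
Putting the pieces together, again continuing from the snippets above:

```python
marginal = (likelihood * prior).sum()        # P(D) on the finite grid
posterior = likelihood * prior / marginal    # Bayes' rule, exact on the grid
print(theta_grid[posterior.argmax()])        # bias with the highest posterior
```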

Let's create a random walk through our probability space; the walk will be limited to values between 0 and 1.
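
A sketch of such a bounded walk (the step size of 0.05 and the starting point are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 0.5                                  # start somewhere in [0, 1]
path = [theta]
for _ in range(1_000):
    candidate = theta + rng.normal(0, 0.05)  # small random move
    if 0.0 <= candidate <= 1.0:              # only legal moves are taken
        theta = candidate
    path.append(theta)
```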

Metropolis Algorithm

Steps

  • start somewhere
  • generate a random move and check whether the move is legal
  • compute the odds of moving = prob(new) / prob(old)
  • generate a random number and move if it is less than those odds
  • rinse and repeat (see the code sketch below)
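
A minimal sketch of these steps for the coin problem, assuming a flat prior (so the unnormalized posterior is just the binomial likelihood); the starting point, step size, and observed counts are illustrative assumptions:

```python
import numpy as np
from scipy.stats import binom

def metropolis(n_heads, n_flips, n_steps=10_000, step_size=0.1, seed=0):
    rng = np.random.default_rng(seed)

    def unnorm_post(theta):
        # flat prior => unnormalized posterior is just the likelihood
        return binom.pmf(n_heads, n_flips, theta)

    theta = 0.5                                       # start somewhere
    samples = []
    for _ in range(n_steps):
        candidate = theta + rng.normal(0, step_size)  # random move
        if 0.0 <= candidate <= 1.0:                   # legal move?
            odds = unnorm_post(candidate) / unnorm_post(theta)
            if rng.random() < odds:                   # accept with prob min(1, odds)
                theta = candidate
        samples.append(theta)
    return np.array(samples)

samples = metropolis(n_heads=63, n_flips=100)  # illustrative counts
print(samples.mean())  # posterior-mean estimate of the coin's bias
```

Because the odds of moving are a ratio of posterior values, the marginal likelihood $P(D)$ cancels out, which is exactly why working with the unnormalized posterior is enough here.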