MAT411 Bayesian Data Analysis

We know from the last class that Bayes' formula is rooted in the geometry of the sample space, so we can use the fact that the intersection of two sets is commutative, i.e. $ A\cap B = B\cap A $.

From this we can derive the formula that is used throughout the foundations of Bayesian data analysis.

First we know

$$ P(A|B) = \frac{P(A\cap B)}{P(B)} $$

and $$ P(B|A) = \frac{P(B\cap A)}{P(A)} $$

Which gives us

$$P(A\cap B) = P(A|B)P(B) $$

and

$$P(B\cap A) = P(B|A)P(A) $$

But, by the commutative law above,

$$ P(A\cap B) = P(B\cap A) $$

Hence

$$ P(A|B)P(B) = P(B|A)P(A)$$

$$ \implies \quad P(A|B) = \frac{P(B|A)P(A)}{P(B)} $$

Or, as it is more conventionally expressed,

$$ P(\theta | D) = \frac{P(D|\theta)P(\theta)}{P(D)} $$

Example for a single data point

A single-data-point interpretation of this would be a Covid test: you went to get a Covid test today and it came back positive. What's the probability that you have Covid?

Here, $\theta$ would be the hypothesis that you have Covid and $D$ would be the data of receiving a positive result.

So $ P(\theta | D)$ is the probability that you have Covid given that you have a positive test.

By Bayes' formula, that is equal to $P(D|\theta) P(\theta)$ divided by $P(D)$.

Today, Feb 24th 2021, the test-positivity rate in the KC metro area is 6.1%.

Looking up some efficacy figures for Covid rapid tests, the true positive rate is about 98.5% and the false positive rate is about 2%.

$$ P(\theta | D) = \frac{P(D|\theta)P(\theta)}{P(D)} = \frac{0.985 \times 0.061}{0.985 \times 0.061 + 0.02 \times 0.939} \approx 76\% $$
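The same arithmetic in Python (a minimal sketch; the variable names are ours, and the 6.1% positivity figure is being used as the prior probability of having Covid):

sensitivity = 0.985   # P(positive | Covid), true positive rate
false_pos = 0.02      # P(positive | no Covid), false positive rate
prior = 0.061         # P(Covid), the KC metro positivity figure

# P(D) = P(+|Covid)P(Covid) + P(+|no Covid)P(no Covid)
evidence = sensitivity * prior + false_pos * (1 - prior)
posterior = sensitivity * prior / evidence
print(f"P(Covid | positive) = {posterior:.2f}")   # ~0.76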

Theoretical meanings for distributional data spaces

Here, $\theta$ can be thought of as a belief/hypothesis, and $D$ as an observed data set.

In words: the probability of your belief given the observed data equals the probability of that data given your belief, times the probability of your belief, divided by the total probability of observing the data.

Moreover this can be expressed as

$$ \text{posterior} = \frac{\text{likelihood} \times \text{prior}}{\text{marginal likelihood}} $$

The prior, or $P(\theta)$, is the distribution of our beliefs/hypotheses before we see any data.

The likelihood, or $P(D| \theta)$, is the probability of the data given the belief/hypothesis that we have.

The posterior, or $P( \theta|D)$, is the distribution of our beliefs/hypotheses after we see the data.

And the marginal likelihood, $P(D)$, also called the normalizing constant, is the total probability of observing the data under all hypotheses combined.
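Concretely, for a discrete set of hypotheses it expands as

$$ P(D) = \sum_{\theta} P(D|\theta)\,P(\theta) $$

which is exactly the sum we will compute on a grid below.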

Let's look at flipping a fair coin.

In [6]:
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from scipy import stats


%matplotlib inline
%config InlineBackend.figure_format='retina'
In [49]:
flips = 10          # number of coin flips
heads_prob = 0.5    # true probability of heads (fair coin)

# simulate the number of heads in `flips` tosses
Heads_flip = stats.binom(flips, heads_prob).rvs()
Tails_flip = flips - Heads_flip
print(Heads_flip, Tails_flip)
6 4
In [50]:
num_space = 100
# grid of candidate heads-probabilities, strictly inside (0, 1)
prob_space = np.linspace(1/num_space, num_space/(num_space+1), num_space)

# uniform prior: equal weight 1/num_space on each grid point
uniform_space = np.ones(num_space) / num_space
In [51]:
plt.plot(prob_space,uniform_space)
[Figure: the uniform prior over prob_space]
In [52]:
# triangle prior: peaks at 0.5, falling linearly toward 0 and 1
triangle_prob = np.minimum(prob_space, 1 - prob_space)
triangle_space = triangle_prob / triangle_prob.sum()
plt.plot(prob_space, triangle_space)
[Figure: the triangle prior over prob_space]
In [53]:
plt.plot(prob_space,uniform_space)
plt.plot(prob_space,triangle_space)
[Figure: uniform and triangle priors together]
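Other prior shapes drop into the grid the same way. For instance (a sketch, not one of the lecture's two priors), a Beta(2, 2) prior evaluated on the same grid:

# Beta(2, 2) prior on the same grid, normalized to sum to 1 (sketch)
beta_prob = stats.beta(2, 2).pdf(prob_space)
beta_space = beta_prob / beta_prob.sum()
plt.plot(prob_space, beta_space)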

We now have our priors.

Next, we compute the likelihood of the observed heads and tails.

In [54]:
prob_space
Out[54]:
array([0.01      , 0.01989999, 0.02979998, ..., 0.97029903, 0.98019902,
       0.99009901])
In [55]:
import math

def nCk(n, k):
    # binomial coefficient "n choose k" = n! / ((n-k)! k!)
    return math.factorial(n) // (math.factorial(n - k) * math.factorial(k))
In [56]:
# binomial likelihood P(D | theta) at each grid value of theta
prob_like = nCk(flips, Heads_flip) * prob_space**Heads_flip * (1 - prob_space)**Tails_flip
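Equivalently, since scipy.stats is already imported, the same likelihood can be evaluated with the binomial pmf (a sketch; prob_like_alt is a name introduced here for the comparison):

# same likelihood via scipy's binomial pmf, vectorized over prob_space
prob_like_alt = stats.binom.pmf(Heads_flip, flips, prob_space)
# np.allclose(prob_like, prob_like_alt) should be True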
In [57]:
prob_like
Out[57]:
array([2.01725162e-10, 1.20342220e-08, 1.30304750e-07, ...,
       1.36375205e-04, 2.86319962e-05, 1.90110260e-06])
In [58]:
plt.plot(prob_space,prob_like,'.')
[Figure: the likelihood over prob_space]

Next we need the marginal likelihood $P(D)$: the prior-weighted sum of the likelihood over the grid.

In [60]:
prob_prior = uniform_space

# P(D): sum of P(D|theta) P(theta) over the grid of theta values
prob_norm = np.sum(prob_like * prob_prior)

Now all we have to do is apply Bayes' rule: $$ P(\theta | D) = \frac{P(D|\theta)P(\theta)}{P(D)} $$

In [62]:
# posterior: likelihood times prior, normalized by P(D)
prob_post = prob_like * prob_prior / prob_norm
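As a quick sanity check (a sketch, not from the original notebook), a properly normalized grid posterior should sum to one:

print(prob_post.sum())   # should print ~1.0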
In [64]:
plt.plot(prob_space, prob_post)
[Figure: the posterior over prob_space]
In [72]:
plt.figure(figsize=(13,6))
plt.stem(prob_space,prob_prior, markerfmt=',')
plt.title('prior')
plt.show()

plt.figure(figsize=(13,6))
plt.stem(prob_space,prob_like, markerfmt=',')
plt.title('likelihood')
plt.show()

plt.figure(figsize=(13,6))
plt.stem(prob_space,prob_post, markerfmt=',')
plt.title('posterior')
plt.show()

Let's put this all into one cell and see how the prior, likelihood, and posterior interact.

In [78]:
flips = 6
heads_prob = 0.5

Heads_flip = stats.binom(flips,heads_prob).rvs()
Tails_flip = flips-Heads_flip
print(Heads_flip,Tails_flip)


num_space = 100
prob_space = np.linspace(1/num_space,num_space/(num_space+1),num_space)
uniform_space = np.ones(num_space)*1/num_space

triangle_prob = np.minimum(prob_space,1-prob_space)
triangle_space = triangle_prob/triangle_prob.sum()

prob_like = nCk(flips,Heads_flip)*prob_space**(Heads_flip)*(1-prob_space)**(Tails_flip)

prob_prior = triangle_space

prob_norm = np.sum(prob_like * prob_prior)   # P(D), the prior-weighted sum of the likelihood

prob_post = prob_like*prob_prior/prob_norm


plt.figure(figsize=(13,6))
plt.stem(prob_space,prob_prior, markerfmt=',')
plt.title('prior')
plt.show()

plt.figure(figsize=(13,6))
plt.stem(prob_space,prob_like, markerfmt=',')
plt.title('likelihood')
plt.show()

plt.figure(figsize=(13,6))
plt.stem(prob_space,prob_post, markerfmt=',')
plt.title('posterior')
plt.show()
2 4
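From here it's natural to summarize the posterior; a minimal sketch reusing the arrays above (post_mean and post_map are names introduced here):

# posterior mean and the grid point with highest posterior probability (MAP)
post_mean = np.sum(prob_space * prob_post)
post_map = prob_space[np.argmax(prob_post)]
print(post_mean, post_map)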