An introduction to Bayesian inference: the likelihood function, Bayes' theorem, conjugate priors, and hierarchical Bayes models in genomics. Also covers the Beta-Binomial and Normal-Normal models and the effect of changes in prior variance, with R scripts and examples.
Grzegorz A. Rempala
Department of Biostatistics, Medical College of Georgia
August 27 - September 8, 2008
1. Intro: Bayesian Inference (Likelihood Function; Bayes Theorem: Prior and Posterior; Conjugate Prior)
2. Binomial Model (Example One: Beta-Binomial)
3. Normal Model (Example Two: Normal-Normal; School Data: Hierarchical Bayesian Model)
4. Hierarchical Bayes (Hyper Prior; Empirical Bayes)
A likelihood function does not give a probability distribution over parameters; it ranks how "likely" each parameter value is to be correct given the data. For example, the likelihood function for the success probability p given 4 heads out of 10 flips is
L(p) = (10 choose 4) p^4 (1 − p)^6 ∝ p^4 (1 − p)^6.
R Script
# Likelihood of p after observing 4 heads in 10 flips
bin.lik <- function(p) { p^4 * (1 - p)^6 }
p <- seq(0, 1, by = 0.02)
plot(p, bin.lik(p), ylab = "likelihood", type = "l")
abline(v = 0.4, lty = 2)   # maximum at p = 4/10
To obtain a probability distribution for a parameter we perform Bayesian inference, which starts from a prior distribution. Bayes' theorem:
f(θ|x) = f(x|θ) f(θ) / f(x) ∝ f(x|θ) f(θ)
Posterior(θ) ∝ Likelihood(θ) × Prior(θ). Here f(x) is the normalizing constant that makes the posterior sum (or integrate) to one. The prior f(θ) represents the information about the parameter θ before observing the data; it is updated by the data to yield the posterior f(θ|x).
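As an illustration (a minimal sketch, not from the original slides; the Beta(2, 2) prior and the 4-heads-in-10-flips data are assumed for the example), the posterior can be computed numerically on a grid, with f(x) obtained by summing likelihood × prior over the grid:

R Script (Posterior on a Grid)

# Grid approximation of the posterior for a coin's success probability
p <- seq(0.001, 0.999, by = 0.001)
likelihood <- dbinom(4, size = 10, prob = p)   # 4 heads in 10 flips
prior <- dbeta(p, 2, 2)                        # assumed Beta(2, 2) prior
unnormalized <- likelihood * prior
posterior <- unnormalized / sum(unnormalized * 0.001)  # normalize so it integrates to one
plot(p, posterior, type = "l", xlab = "p", ylab = "Post. Density")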
Many kinds of probability distributions can serve as priors, but here we will use only conjugate priors. A conjugate prior yields a posterior in the same family as the prior, usually with a very simple mathematical form.
Conjugate Priors

Likelihood   Conjugate Prior   Posterior
Normal       Normal            Normal
Binomial     Beta              Beta
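For the Binomial likelihood, a Beta(α, β) prior gives a Beta(α + y, β + n − y) posterior after observing y successes in n trials. A quick numerical check of this conjugate update (a sketch with assumed example values):

R Script (Conjugacy Check)

# Beta(a, b) prior + y successes in n trials -> Beta(a + y, b + n - y) posterior
a <- 2; b <- 2; y <- 4; n <- 10                  # assumed example values
p <- seq(0.001, 0.999, by = 0.001)
grid.post <- dbinom(y, n, p) * dbeta(p, a, b)
grid.post <- grid.post / sum(grid.post * 0.001)  # normalize numerically
closed.form <- dbeta(p, a + y, b + n - y)        # conjugate closed form
max(abs(grid.post - closed.form))                # essentially zero (up to grid error)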
Suppose we have a prior belief that the coin is fair, i.e. p = 1/2. Symmetric Beta(α, β) priors all have mean 1/2 but encode this belief with different strength:

α    β    Mean   Var
1    1    1/2    0.083
6    6    1/2    0.019
12   12   1/2    0.010
18   18   1/2    0.007
Figure: Shapes of the Beta priors (1,1), (6,6), (12,12), and (18,18); x-axis p, y-axis probability density. Note that Beta(1,1) is non-informative (uniform on [0,1]).
R Script (Beta Distribution)
a <- c(1, 6, 12, 18)
b <- c(1, 6, 12, 18)
# Variance of Beta(a, b)
a * b / ((a + b)^2 * (a + b + 1))

x <- seq(0, 1, by = 0.01)
y <- dbeta(x, a[4], b[4])
plot(x, y, type = "l", xlab = "p", ylab = "Prob. Density")
text(0.5, dbeta(0.5, a[4], b[4]), "(18,18)")
y <- dbeta(x, a[3], b[3])
lines(x, y); text(0.5, dbeta(0.5, a[3], b[3]), "(12,12)")
y <- dbeta(x, a[2], b[2])
lines(x, y); text(0.5, dbeta(0.5, a[2], b[2]), "(6,6)")
y <- dbeta(x, a[1], b[1])
lines(x, y); text(0.5, dbeta(0.5, a[1], b[1]), "(1,1)")
Figure: Beta posteriors for n = 5 and y = 1 under the four priors above; x-axis p, y-axis posterior density. The posterior means and variances are:

Prior        Post. Mean   Post. Var
Beta(18,18)  0.463        0.006
Beta(12,12)  0.448        0.008
Beta(6,6)    0.412        0.013
Beta(1,1)    0.286        0.026
R Script (Posterior Beta Distribution)
y <- 1; n <- 5
a <- c(1, 6, 12, 18) + y       # posterior alpha = prior alpha + y
b <- c(1, 6, 12, 18) + n - y   # posterior beta = prior beta + n - y
# Posterior mean and variance
m <- a / (a + b)
v <- a * b / ((a + b)^2 * (a + b + 1))

x <- seq(0, 1, by = 0.01)
y <- dbeta(x, a[4], b[4])
plot(x, y, type = "l", xlab = "p", ylab = "Post. Density")
text(0.7, dbeta(0.5, a[4], b[4]), paste("m=", round(m[4], 3), " ", "v=", round(v[4], 3)))
y <- dbeta(x, a[3], b[3]); lines(x, y)
text(0.7, dbeta(0.5, a[3], b[3]), paste("m=", round(m[3], 3), " ", "v=", round(v[3], 3)))
y <- dbeta(x, a[2], b[2]); lines(x, y)
text(0.7, dbeta(0.5, a[2], b[2]), paste("m=", round(m[2], 3), " ", "v=", round(v[2], 3)))
y <- dbeta(x, a[1], b[1]); lines(x, y)
text(0.8, dbeta(0.5, a[1], b[1]), paste("m=", round(m[1], 3), " ", "v=", round(v[1], 3)))
Let the scores of a sample of n students on a standardized test be y_1, y_2, ..., y_n. Assume they follow a normal distribution, i.e. y_i ∼ N(μ, σ^2) where σ^2 is known. Therefore
ȳ ∼ N(μ, σ^2/n)
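A quick Monte Carlo check of this sampling distribution (a sketch; the values of n, μ, and σ are assumed for illustration):

R Script (Sampling Distribution of the Mean)

# The sample mean of n draws from N(mu, sigma^2) has sd sigma/sqrt(n)
set.seed(1)
n <- 10; mu <- 5; sigma <- 2
ybar <- replicate(10000, mean(rnorm(n, mu, sigma)))
c(sd(ybar), sigma / sqrt(n))   # both approximately 0.63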
From past data on schools in the same district, we know that the mean score varies from school to school as μ ∼ N(μ_0, τ_0^2). To estimate the mean test score of the school, we can use the MLE, i.e.
ȳ = Σ_{i=1}^n y_i / n
We could also use the Bayesian method. The likelihood is
f(ȳ|μ) ∝ ∏_{i=1}^n exp( −(y_i − μ)^2 / (2σ^2) ) ∝ exp( −(n/2) (ȳ − μ)^2 / σ^2 )
The prior is
f(μ) ∝ exp( −(μ − μ_0)^2 / (2τ_0^2) )
Therefore the posterior distribution is
f(μ|ȳ) ∝ exp( −(1/2) [ (n/σ^2)(ȳ − μ)^2 + (1/τ_0^2)(μ − μ_0)^2 ] ) ∝ exp( −(μ − μ_1)^2 / (2τ_1^2) )

where

μ_1 = ( (n/σ^2) ȳ + (1/τ_0^2) μ_0 ) / ( n/σ^2 + 1/τ_0^2 ) and τ_1^2 = 1 / ( n/σ^2 + 1/τ_0^2 ).
Now assume {y_i} = {90, 75, 92, 65}, so ȳ = 80.5, and σ^2 = 100. Suppose μ_0 = 60. Under different values of τ_0^2 we obtain different posterior means and variances.
Prior Variance (τ_0^2)   Post. Mean   Post. Var
5                        63.42        4.17
20                       69.11        11.11
30                       71.18        13.64
60                       74.47        17.65
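The table can be reproduced directly from the formulas for μ_1 and τ_1^2 above (a minimal sketch; the helper name post.normal is ours):

R Script (Normal-Normal Posterior)

# Posterior mean and variance of mu for the school data
post.normal <- function(tau0sq, ybar = 80.5, n = 4, sigma2 = 100, mu0 = 60) {
  prec <- n / sigma2 + 1 / tau0sq                   # posterior precision 1/tau1^2
  mu1 <- (n / sigma2 * ybar + mu0 / tau0sq) / prec  # posterior mean
  c(post.mean = mu1, post.var = 1 / prec)
}
round(sapply(c(5, 20, 30, 60), post.normal), 2)
#            [,1]  [,2]  [,3]  [,4]
# post.mean 63.42 69.11 71.18 74.47
# post.var   4.17 11.11 13.64 17.65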
Figure: Posterior densities of μ for τ_0^2 = 5, 20, 30, and 60 (panels labeled tau = 5, 20, 30, 60; x-axis score from 50 to 100, y-axis posterior density). As the prior variance grows, the posterior mean moves away from the prior mean μ_0 = 60 toward ȳ = 80.5.
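The four-panel figure can be regenerated with a sketch along these lines (the plotting choices are ours):

R Script (Posterior Densities)

# Posterior density of mu for each prior variance tau0^2
par(mfrow = c(2, 2))
for (tau0sq in c(5, 20, 30, 60)) {
  prec <- 4 / 100 + 1 / tau0sq                 # n/sigma^2 + 1/tau0^2
  mu1 <- (4 / 100 * 80.5 + 60 / tau0sq) / prec # posterior mean
  score <- seq(50, 100, by = 0.1)
  plot(score, dnorm(score, mu1, sqrt(1 / prec)), type = "l",
       main = paste("tau =", tau0sq), xlab = "score", ylab = "y.posterior")
}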