-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathmetropolis_algo.Rmd
267 lines (177 loc) · 7.56 KB
/
metropolis_algo.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
---
title: "Simple Metropolis Algorithm"
author: "Andrew"
date: "26/08/2021"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
This is a quick description of one way to write the Metropolis algorithm in R.
The full code is here, with a description of each part below. The chunk below is ready to copy and paste for your own experiments!
```{r}
# Helper function to create log likelihoods
make_likelihood <- function(f = dpois){
force(f)
function(data, ...) sum(f(data, ..., log = TRUE))
}
## define likelihoods for our problem
poisson_loglike <- make_likelihood(f = dpois)
normal_loglike <- make_likelihood(f = dnorm)
## function to use these to calculate the value of the numerator in Bayes Formula
numerator <- function(v, data = xs) {
poisson_loglike(data, lambda = exp(v)) + normal_loglike(v, mean = 1, sd = 2)
}
# proposal function to make new parameter values.
propose_new <- function(x, sig_tune) rnorm(1, mean = x, sd = sig_tune)
# start value
metropolis <- function(dataset,
n_samples,
tune_parameter,
start_value,
numerator_function = numerator){
chain <- numeric(n_samples)
chain[1] <- 0.1
for (i in 2:length(chain)){
start <- chain[i-1]
new <- propose_new(start, sig_tune = tune_parameter)
r <- exp(numerator_function(new, data = dataset) -
numerator_function(start, data = dataset))
p_accept <- min(1, r)
chain[i] <- ifelse(runif(1) < p_accept, new, start)
}
return(chain)
}
```
## Using this function and visualizing the results
```{r}
set.seed(1859)
xs <- rpois(5, 27)
chain <- metropolis(dataset = xs, n_samples = 5000, tune_parameter = 0.2, start_value = 0)
plot(exp(chain))
```
When we visualize the chain to see the samples, we see what we hope to find: the sampler is bouncing around, drawing samples from the entire posterior.
Note at the beginning the sampler has to first "travel" to find the right range of values. These samples are not part of our answer, and aren't useful to us. This is why we often drop the first few samples; these are known as the "burn-in" samples.
```{r}
set.seed(1859)
xs <- rpois(5, 27)
plot(exp(chain[300:5000]), type = "l")
```
This is a very common way of visualizing the posterior samples; this is called the "trace plot". You want it to look like this: a "fuzzy caterpillar". This means that the sampler is moving around the posterior easily, and not getting stuck in a particular zone.
### Question: why do we transform our chain with `exp()`?
In our model we applied the log link to the mean of the poisson distribution.
Look back at our function called `numerator`, which calculates the likelihood times the prior. You'll see that we are estimating the posterior for `a`, which is the _exponent_ on $\lambda$.
Our model is:
$$
\begin{align}
y &\sim \text{Poisson}(e^{a}) \\
a &\sim \text{Normal}(1, 2)
\end{align}
$$
Our chain samples are for the parameter $a$. So if we want the posterior distribution of the _mean_ of the Poisson, we need to use `exp` to transform our chain.
## Comparing to the conjugate
We have a very straightforward way of checking our work: we can calculate the conjugate posterior for our chain and compare that to our samples:
First we need to imagine a new model, with a different prior:
$$
\begin{align}
y &\sim \text{Poisson}(\lambda) \\
\lambda &\sim \text{Gamma}(25^2/15^2, 25/15^2)
\end{align}
$$
The prior looks like this:
```{r}
curve(dgamma(x, (25/15)^2, 25/15^2), xlim = c(1, 50))
```
And here is the comparison of the two:
```{r }
# plot of the prior
# plot of chain samples
plot(density(exp(chain[250:5000])), ylim = c(0,.3), lty = 2, lwd = 2,
main = "Posterior samples compared to conjugate posterior")
# plot of prior
curve(dgamma(x, (25/15)^2 + sum(xs),
25/15^2 + length(xs)),
xlim = c(1, 50),
add = TRUE, lwd = 2, col = "green")
```
And the match is pretty close!
## Breaking down the code
### Define likelihoods and priors
```{r}
# Helper function to create likelihoods
make_likelihood <- function(f = dpois){
force(f)
function(data, ...) sum(f(data, ..., log = TRUE))
}
```
This function is the gnarly part. This is a function that takes a function and _returns another function_.
In this case, we give `make_likelihood()` a probability density function, like `dnorm`, `dpois`, `dlnorm` etc. The output is a function that will calculate the log-likelihood using that distribution.
For example, we can make a function to calculate the likelihood of some data using the Poisson distribution.
```{r}
poisson_loglike <- make_likelihood(dpois)
```
In this example, `poisson_loglike()` is a new function. It has two arguments:
1. `data`, which is a vector of observations we want to use to calculate the log likelihood.
1. `...` which means "any arguments that the distribution function needs". Some will only need one argument (e.g. `dpois()` needs only `lambda`) an others would need two or more (for example `dnorm()` needs `mean` and `sd`)
Experiment to convince yourself that this function does what you expect
```{r}
poisson_loglike(xs, lambda = 5)
poisson_loglike(xs, lambda = 6)
poisson_loglike(xs, lambda = 9)
poisson_loglike(xs, lambda = 30)
```
We can even use this function to perform a simple grid search for lambda:
```{r}
values_of_lambda <- seq(5,50, length.out = 100)
ll_pois <- sapply(values_of_lambda, function(x) poisson_loglike(data = xs, lambda = x))
plot(values_of_lambda, ll_pois)
```
This hopefully helps you to see that this function accomplishes the same thing as our earlier work on likelihoods.
```{r}
normal_loglike <- make_likelihood(f = dnorm)
normal_loglike(5, mean = 10, sd = 2)
```
The next step creates another likelihood generating function.
This time the *input* is only one number.
We're going to use this function to calculate the likelihood of proposals according to the _prior_, so we are testing only one number every time.
`normal_loglike()` still contains a sum, but since there is only one number the output is the exact same as using `dnorm(5, mean = 10, sd = 2, log = TRUE)`
```{r}
normal_loglike(5, mean = 10, sd = 2)
dnorm(5, mean = 10, sd = 2, log = TRUE)
```
### Combine likelihood and prior to make the numerator
```{r}
## function to use these to calculate the value of the numerator in Bayes Formula
numerator <- function(v, data = xs) {
poisson_loglike(data, lambda = exp(v)) + normal_loglike(v, mean = 1, sd = 2)
}
```
This `numerator()` function adds together the log-likelihood of the data and the prior, calculated for some value `v`.
### Proposal function
Finally we need a _proposal function_. This takes the current value and returns a proposition.
```{r}
propose_new <- function(x, sig_tune) rnorm(1, mean = x, sd = sig_tune)
```
We can check this one to make sure it works as we intend
```{r}
propose_new(10, sig_tune = 2)
propose_new(10, sig_tune = 2)
propose_new(10, sig_tune = 2)
propose_new(10, sig_tune = 2)
```
The larger `sig_tune` the farther the new value will be from the old one.
### Run the chain
Now we just need to generate proposals and decide wether to accept or reject them.
We do this by generating a proposal, then apply the numerator function to the proposal and the original value, then calculating the acceptance probability.
```{r}
chain <- numeric(500)
chain[1] <- 0.1
for (i in 2:length(chain)){
start <- chain[i-1]
new <- propose_new(start, 0.1)
r <- exp(numerator(new) - numerator(start))
p_accept <- min(1, r)
chain[i] <- ifelse(runif(1) < p_accept, new, start)
}
plot(exp(chain))
```