report.Rmd

---
title: "Validation results"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
options(tidyverse.quiet = TRUE)
```

This report contains the validation results of a small Bayesian model. Here, we summarize the results computed in earlier targets of the pipeline. We reference our targets with `tar_load()` and `tar_read()`. This has two benefits:

1. Because of the `tar_render()` function from the [`tarchetypes`](https://wlandau.github.io/tarchetypes) package (see `_targets.R`), `targets` automatically detects the dependencies of this report and rebuilds it when its dependencies change.
1. We can run the report by itself if the targets are already in the `_targets/` data store.
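
For orientation, the pipeline side of this arrangement might look like the sketch below. Only `tar_render()`, `report.Rmd`, and the target names `fit_continuous` and `fit_discrete` come from this report. `simulate_and_fit()` is a hypothetical placeholder, and the actual `_targets.R` may be organized differently. The block uses a plain `r` fence, so it is displayed but not evaluated when the report is knit.

```r
# _targets.R (illustrative sketch, not the project's actual file)
library(targets)
library(tarchetypes)

# Hypothetical helper: stands in for whatever simulates data and fits the model.
simulate_and_fit <- function(covariate) {
  stop("placeholder: replace with the real simulation and model-fitting code")
}

list(
  # Upstream targets that this report loads with tar_load().
  tar_target(fit_continuous, simulate_and_fit(covariate = "continuous")),
  tar_target(fit_discrete, simulate_and_fit(covariate = "discrete")),
  # tar_render() scans report.Rmd for tar_load()/tar_read() calls and
  # registers the referenced targets as dependencies of the report.
  tar_render(report, "report.Rmd")
)
```

With a pipeline like this, `targets::tar_make()` rebuilds the report whenever an upstream target changes, while `rmarkdown::render("report.Rmd")` knits it on its own as long as the targets are already in `_targets/`.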

## Continuous covariate

```{r}
library(targets)
library(tidyverse)
tar_load(fit_continuous)
```

Our results are in a data frame with one row per simulated dataset and columns with information about our fitted models.

```{r, paged.print = FALSE}
fit_continuous
```

If we implemented the model in `stan/model.stan` correctly, then the 90% credible intervals should cover the true `beta` parameter that generated the data in roughly 90% of model fits.

```{r}
mean(fit_continuous$cover_beta)
```
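
How close to 90% counts as "roughly" depends on the number of simulated datasets. One way to judge it, sketched below, is to compare the observed coverage with a band of about two binomial standard errors around the nominal rate. The helper `coverage_summary()` and the toy indicator vector are assumptions for illustration, not pipeline output, and the block is displayed rather than evaluated.

```r
# Compare empirical coverage with a band of +/- 2 binomial standard errors
# around the nominal rate. `cover` is a logical vector: TRUE when the
# credible interval covered the true parameter.
coverage_summary <- function(cover, nominal = 0.9) {
  n <- length(cover)
  # Standard error of a proportion if the true coverage equals `nominal`.
  se <- sqrt(nominal * (1 - nominal) / n)
  c(
    estimate = mean(cover),
    lower = nominal - 2 * se,
    upper = nominal + 2 * se
  )
}

# Toy indicators for 100 hypothetical simulations (88 covered, 12 missed).
cover_example <- rep(c(TRUE, FALSE), times = c(88, 12))
result <- coverage_summary(cover_example)
result
```

In the report itself, the same helper could be applied to `fit_continuous$cover_beta`. An estimate outside the band would suggest a problem with the model code or the simulation setup.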

The posterior median of `beta` should be reasonably close to the true value.

```{r}
ggplot(fit_continuous) +
  geom_point(aes(x = beta_true, y = median)) +
  geom_abline(intercept = 0, slope = 1) +
  theme_gray(16)
```
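
The scatterplot can be complemented with numeric summaries of how far the posterior medians fall from the truth. The sketch below computes bias and root mean squared error on a toy data frame that mimics the `beta_true` and `median` columns. The values are made up for illustration, and the block is displayed rather than evaluated.

```r
# Toy stand-in for the columns used in the plot (not pipeline output).
fits_example <- data.frame(
  beta_true = c(-1.0, 0.0, 0.5, 2.0),
  median = c(-0.9, 0.1, 0.4, 2.2)
)

# Signed error of the posterior median for each simulated dataset.
error <- fits_example$median - fits_example$beta_true

accuracy <- c(
  bias = mean(error),         # systematic over- or under-estimation
  rmse = sqrt(mean(error^2))  # typical distance from the true value
)
accuracy
```

A bias near zero together with a small RMSE, relative to the spread of `beta_true`, matches points lying close to the identity line in the plot.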

We should also check convergence diagnostics. `rhat` should ideally be close to 1.

```{r}
ggplot(fit_continuous) +
  geom_histogram(aes(x = rhat), bins = 20)
```

Effective sample size should ideally be high.

```{r}
ggplot(fit_continuous) +
  geom_histogram(aes(x = ess_bulk), bins = 20)
```

```{r}
ggplot(fit_continuous) +
  geom_histogram(aes(x = ess_tail), bins = 20)
```
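
Histograms show the overall shape, but it can also help to count how many fits miss a given threshold. The sketch below flags fits with `rhat` above 1.01 or effective sample sizes below 400. These cutoffs are commonly cited rules of thumb, not settings taken from this pipeline, and `flag_diagnostics()` and the toy data frame are assumptions for illustration. The block is displayed rather than evaluated.

```r
# Flag fits whose diagnostics miss the chosen thresholds.
# The defaults (rhat <= 1.01, ESS >= 400) are illustrative assumptions.
flag_diagnostics <- function(fits, rhat_max = 1.01, ess_min = 400) {
  data.frame(
    rhat_high = fits$rhat > rhat_max,
    ess_bulk_low = fits$ess_bulk < ess_min,
    ess_tail_low = fits$ess_tail < ess_min
  )
}

# Toy diagnostics for three hypothetical fits (not pipeline output).
diagnostics_example <- data.frame(
  rhat = c(1.001, 1.020, 1.004),
  ess_bulk = c(1800, 950, 350),
  ess_tail = c(1500, 700, 420)
)

flags <- flag_diagnostics(diagnostics_example)
colMeans(flags)  # proportion of fits flagged by each check
```

Applied to `fit_continuous` or `fit_discrete`, the column means give the share of simulated fits that would deserve a closer look.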

## Discrete covariate

```{r}
tar_load(fit_discrete)
```

Here are the analogous results for the discrete covariate simulations.

```{r, paged.print = FALSE}
fit_discrete
```

```{r}
mean(fit_discrete$cover_beta)
```

The posterior median of `beta` should be reasonably close to the true value.

```{r}
ggplot(fit_discrete) +
  geom_point(aes(x = beta_true, y = median)) +
  geom_abline(intercept = 0, slope = 1) +
  theme_gray(16)
```

We should also check convergence diagnostics. `rhat` should ideally be close to 1.

```{r}
ggplot(fit_discrete) +
  geom_histogram(aes(x = rhat), bins = 20)
```

Effective sample size should ideally be high.

```{r}
ggplot(fit_discrete) +
  geom_histogram(aes(x = ess_bulk), bins = 20)
```

```{r}
ggplot(fit_discrete) +
  geom_histogram(aes(x = ess_tail), bins = 20)
```