Skip to content

Commit

Permalink
Browse files Browse the repository at this point in the history
  • Loading branch information
LindaAmoafo committed Mar 21, 2023
1 parent 38c045c commit 5610c14
Show file tree
Hide file tree
Showing 15 changed files with 311 additions and 6 deletions.
4 changes: 3 additions & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -14,11 +14,13 @@ Imports:
stats
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
Suggests:
Suggests:
kableExtra,
knitr,
mvtnorm,
rmarkdown,
testthat (>= 3.0.0)
RoxygenNote: 7.2.3
Config/testthat/edition: 3
VignetteBuilder: knitr
LazyData: true
9 changes: 5 additions & 4 deletions R/BayesSPM.R
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
#' -SPMBayes-
#'
#' An R-package for conducting Bayesian SPM Analysis
#'
#' @return A list containing summaries(mean and SD) of the data at each time point with the posterior probability threshold and q-value threshold
#' @param Y (J x T) array specifying observations for T time points.
#' @param group (J x 1) factor vector specify the groups to which each observation belongs.
#' @param time (T x 1) vector specify the times (% stance)
Expand Down Expand Up @@ -142,9 +142,9 @@ plot.SPMBayes <- function(x, summary = TRUE, ...) {
ggplot2::geom_hline(yintercept = PPq, linewidth = 0.5, linetype = "dashed") +
ggplot2::geom_hline(yintercept = threshold, linewidth = 0.5, linetype = "dashed") +
ggplot2::geom_hline(yintercept = Q.threshold, linewidth = 0.5, linetype = "dashed") +
ggplot2::annotate("text", x = 10, y = Q.threshold-0.01, label = paste0("Q <", round(Q.threshold,2))) +
ggplot2::annotate("text", x = 10, y = threshold+0.02, label = paste("P(H1 | data) > ", threshold, sep="")) +
ggplot2::annotate("text", x = 10, y = 0.00, label = paste("P(H0 | data) > ", threshold, sep="")) +
ggplot2::annotate("text", x = 30, y = PPq-0.05, label = paste0("Q <", round(Q.threshold,2))) +
ggplot2::annotate("text", x = 30, y = threshold+0.02, label = paste("P(H1 | data) > ", threshold, sep="")) +
ggplot2::annotate("text", x = 30, y = 0.00, label = paste("P(H0 | data) > ", threshold, sep="")) +
ggplot2::geom_ribbon(data = subset(SUMMARY, tH1c), ggplot2::aes(ymin = threshold, ymax = PPH1),
fill = "gray", alpha = 0.4) +
ggplot2::geom_ribbon(data = subset(SUMMARY, tH1c), ggplot2::aes(ymin = PPq, ymax = PPH1),
Expand All @@ -158,3 +158,4 @@ plot.SPMBayes <- function(x, summary = TRUE, ...) {
}

}

24 changes: 24 additions & 0 deletions R/GaitSymmetry.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
#' GaitSymmetry data
#'
#' Data from Single subject dataset (healthy male, 28
#' years, 83 kg, 178 cm) of left and right leg knee
#' flexion angles during 99 gait cycles on a dual-belt
#' tredmill at constant speed of 4.5 km/h, time
#' normalized to 101 time samples. Gait kinematics
#' were recorded with a 6-camera VICON system at 250Hz.
#'
#' @docType data
#' @param GaitSymmetry The data for GaitSymmetry
#'
#' @usage data(GaitSymmetry)
#'
#' @format An object of class \code{SPMBayes}
#' @keywords datasets
#'
#'
#'
#' @examples
#' head(data(GaitSymmetry))
"GaitSymmetry"


18 changes: 18 additions & 0 deletions R/PlantarArchAngle.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
#' PlantarArchAngle data
#'
#' Dataset of 2 x n = 10 experimental time series of the plantar arch
#' angle of the foot at 101 time samples each
#'
#' @docType data
#' @param PlantarArchAngle The data for PlantarArchAngle
#' @usage data(PlantarArchAngle)
#'
#' @format An object of class \code{SPMBayes}
#' @keywords datasets
#'
#'
#' @examples
#' head(data(PlantarArchAngle))
"PlantarArchAngle"


18 changes: 18 additions & 0 deletions R/SimulatedTwoLocalMax.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
#' SimulatedTwoLocalMax data
#'
#' Dataset of 2 x n = 10 experimental time series of the plantar arch
#' angle of the foot at 101 time samples each
#'
#' @docType data
#' @param SimulatedTwoLocalMax The data for SimulatedTwoLocalMax
#'
#' @usage data(SimulatedTwoLocalMax)
#'
#' @format An object of class \code{SPMBayes}
#' @keywords datasets
#'
#'
#'
#' @examples
#' head(data(SimulatedTwoLocalMax))
"SimulatedTwoLocalMax"
Binary file added data/GaitSymmetry.RData
Binary file not shown.
Binary file added data/PlantarArchAngle.RData
Binary file not shown.
Binary file added data/SimulatedTwoLocalMax.RData
Binary file not shown.
27 changes: 27 additions & 0 deletions man/GaitSymmetry.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

23 changes: 23 additions & 0 deletions man/PlantarArchAngle.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 3 additions & 0 deletions man/SPMBayes.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

23 changes: 23 additions & 0 deletions man/SimulatedTwoLocalMax.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

33 changes: 33 additions & 0 deletions vignettes/reference.bib
Original file line number Diff line number Diff line change
Expand Up @@ -13,3 +13,36 @@ @misc{spm1d
title = {{one-dimensional Statistical Parametric Mapping}},
howpublished = "\url{https://spm1d.org/#/}"
}

@article{serrien2019statistical,
title={Statistical parametric mapping of biomechanical one-dimensional data with Bayesian inference},
author={Serrien, Ben and Goossens, Maggy and Baeyens, Jean-Pierre},
journal={International Biomechanics},
volume={6},
number={1},
pages={9--18},
year={2019},
publisher={Taylor \& Francis}
}

@article{kall2008non,
title={Non-parametric estimation of posterior error probabilities associated with peptides identified by tandem mass spectrometry},
author={K{\"a}ll, Lukas and Storey, John D and Noble, William Stafford},
journal={Bioinformatics},
volume={24},
number={16},
pages={i42--i48},
year={2008},
publisher={Oxford University Press}
}

@article{pataky2010generalized,
title={Generalized n-dimensional biomechanical field analysis using statistical parametric mapping},
author={Pataky, Todd C},
journal={Journal of biomechanics},
volume={43},
number={10},
pages={1976--1982},
year={2010},
publisher={Elsevier}
}
131 changes: 131 additions & 0 deletions vignettes/vignette.Rmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,131 @@
---
title: "Using BayesSPM to conduct Statistical parametric mapping of bio-mechanical data with Bayesian inference"
author: "Linda Amoafo"
date: 3/16/2023
output:
rmarkdown::html_vignette:
toc: true
bibliography: reference.bib
vignette: >
%\VignetteIndexEntry{Using BayesSPM to conduct Statistical parametric mapping of bio-mechanical data with Bayesian inference}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(kableExtra)
```

# **Introduction**

Statistical Parametric Mapping (SPM) was designed for statistical inference on dependent variables sampled from a large number of spatially correlated voxels (volume elements) [@friston2007short]. The biomechanics research community has expressed a keen interest in recent advancements in Statistical Parametric Mapping (SPM) for continuum data, such as kinematic time series. The introduction of SPM into biomechanical literature was facilitated by T. Pataky's Python/MATLAB package, [@spm1d], which was originally adapted from neuroimaging. The package allows frequentist statistical analyses that are commonly used in biomechanics. The BayesSPM package proposes the use of Bayesian analogs of SPM, which are based on Bayes factors and posterior probability with default priors, using the BayesFactor package in R. The package presents results for t test (two-sample and paired sample t-tests) as well ANOVA (for multiple group testing). This vignette is motivated by the 2 sample t-test Bayesian SPM work by [@serrien2019statistical]

# **Classical SPM**

To prevent an inflation of Type I error, Statistical Parametric Mapping (SPM) uses Random Field Theory (RFT) instead of conducting separate inferential tests at each time point. RFT makes use of smoothness, which refers to the local correlation between adjacent time points, to address the multiple testing problem. This approach provides accurate control of Type I errors, regardless of the sampling rate, when testing correlated field data. In the context of biomechanics, neighboring time samples are not independent, and this needs to be considered [@pataky2010generalized]. Instead of computing a p-value at each time sample, RFT calculates a p-value for clusters of statistics (e.g., t) that cross a critical threshold (t*). The rationale behind RFT is that the height and width of suprathreshold clusters generated by smooth random fields are inversely proportional to their probability of occurrence. Therefore, a large suprathreshold cluster is the topological equivalent of a large t-value for 0D data.

The critical thresholds for Statistical Parametric Mapping (SPM) are typically calculated using a Type I error of α = 0.05. As a result, when the observed test-statistic time series crosses the threshold, the cluster has a p-value less than 0.05. This enables the researcher to reject the null hypothesis ($H_0$) that there is no difference between the two time series.

# **BAYESIAN INFERENCE**

The SPM p-values represent the probability of the observed data given that the null hypothesis $(H_0)$ is true, i.e., $P(data | H_0)$, without considering any alternative hypothesis $(H_1)$. However, most researchers would want to know, the probability that $(H_0)$ or $(H_1)$ are good descriptions of the data, ie. $P(H_0 | data), P(H_1 | data)$. Additionally, frequentist inference is asymmetric in the sense that, it is only possible to state evidence against $H_0$ and not evidence in favor of $H_0$ or in favor of any alternative $H_1$.

Bayesian inference allows for statements regarding evidence in favor of the null hypothesis $(H_0)$ or any alternative hypothesis $(H1)$. For example, in a study aimed at demonstrating that gait kinematic time series are left-right symmetric, Bayesian inference can provide quantification of the evidence in favor of $H_0: μ_{left}(t) = μ_{right}(t)$. This is possible due to the incorporation of prior knowledge into the analysis, which allows for the calculation of the probability of both $H_0$ and $H_1$ being true.

## **Bayesian SPM Posterior Probabilty Maps**

In this package, we rely on the Bayes factor as an approach to producing SPM posterior probability maps at each voxel/ time point.

## **Posterior Probability Threshold**

Users are allowed to set a posterior probability to control for false discovery rate of their choice. This posterior probability serve as the analogous supra-cluster threshold in the frequentist setting. eg, set posterior probability of 0.95 to control for a 5% false discovery rate, equivalent to $\alpha=0.05$.

## **Q-value Threshold (Less conservative threshold)**
A less conservative threshold, while still keeping the FDR at the same level is the use of the q-value which is defined as the cumulative mean of the posterior error probabilities [@kall2008non].


## Examples
Examples in this vignette is inspired by the project data used by [@serrien2019statistical], which are example datasets that come included with the open-source spm1d-package (spm1d.org, [@spm1d]) and one additional dataset from their lab.

```{r, echo=F}
dat.describe <- data.frame(`Statistical Test`= c("Two-sample t-test",
"Paired-sample t-test", "Paired sample t-test"), Description = c("SimulatedTwoLocalMax. Dataset of 2 x n = 6 simulated
time series of 101 time samples each. The first set are
smooth unit Gaussian random trajectories.
The second set are also smooth unit Gaussian
trajectories, but with bursts at t = 25 and t = 75.",
"PlantarArchAngle (Caravaggi et al., 2010). Dataset of 2
x n = 10 experimental time series of the plantar arch
angle of the foot at 101 time samples each.",
"GaitSymmetry. Single subject dataset (healthy male, 28
years, 83 kg, 178 cm) of left and right leg knee
flexion angles during 99 gait cycles on a dual-belt
tredmill at constant speed of 4.5 km/h, time
normalized to 101 time samples. Gait kinematics
were recorded with a 6-camera VICON system at 250
Hz.") )
kable(dat.describe)
```

## Example1
For our first example, we would use the SimulatedTwoLocalMax data (description above)

first install the BayesSPM package and call the data which comes with this package
```{r}
#install.packages("BayesSPM")
library(BayesSPM)
data(SimulatedTwoLocalMax)
head(SimulatedTwoLocalMax,2)
```
The first column shows the group indicator and the rest are outputs

Next we get a summary plot of the data
```{r, warning=F, message=F, fig.height=5, fig.width=8}
Y <- PlantarArchAngle[,-1]
group <- PlantarArchAngle[,1]
time=1:ncol(Y)
out <- SPMBayes(Y=Y, group=group, time=time, threshold = 0.95, Hypothesis = "alt", paired = FALSE)
plot(x=out, summary = T)
```

Setting summary = FALSE provides an SPM{P} Maps
```{r, fig.height=5, fig.width=8}
plot(x=out, summary = F)
```

SPM{P} MAP shows evidence against the null at t = 24–27 and at t = 77


```{r, fig.height=5, fig.width=8}
Y <- SimulatedTwoLocalMax[,-1]
group <- SimulatedTwoLocalMax[,1]
time=1:ncol(Y)
out <- SPMBayes(Y=Y, group=group, time=time, threshold = 0.95, Hypothesis = "alt", paired = FALSE)
plot(x=out, summary = T)
```
```{r, fig.height=5, fig.width=8}
plot(x=out, summary = F)
```
In both examples H0 may be rejected at a type I error rate of 5%.
The less conservative q* = 0.05 threshold always yields broader clusters than the P(H | data)* = 0.95 threshold, but in these exapmles the difference between both thresholds is small.


```{r, include=F, echo=F, eval=F, fig.height=5, fig.width=8}
Y <- GaitSymmetry[,-1]
group <- GaitSymmetry[,1]
time=1:ncol(Y)
out <- SPMBayes(Y=Y, group=group, time=time, threshold = 0.95, Hypothesis = "Null", paired = FALSE)
plot(x=out, summary = T)
```

```{r, include=F, echo=F, eval=F}
plot(x=out, summary = F)
```
4 changes: 3 additions & 1 deletion vignettes/vignette.qmd
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,9 @@
title: "Using BayesSPM to conduct Statistical parametric mapping of bio-mechanical data with Bayesian inference"
author: "Linda Amoafo"
date: 3/14/2023
format: gfm
format:
html:
toc: true
bibliography: reference.bib
editor: visual
vignette: >
Expand Down

0 comments on commit 5610c14

Please sign in to comment.