Merge pull request #106 from The-Motor-Unit/more-docs

Final edits to readme and notebook, and standardization of docstrings
The-Motor-Unit · Jun 29, 2022 · 6112650
2 parents 68db3bc + f2ca67e
commit 6112650
Showing 10 changed files with 245 additions and 191 deletions.
diff --git a/README.md b/README.md
@@ -6,6 +6,63 @@
 
 A package for decomposing multi-channel intramuscular and surface EMG signals into individual motor unit activity based off the blind source algorithm described in [`Negro et al. (2016)`](https://iopscience.iop.org/article/10.1088/1741-2560/13/2/026027/meta).
 
+## Table of Contents
+
+- [Overview](#overview)
+- [Project Directory](#project-directory)
+- [Proposal and Final Report](#proposal-and-final-report)
+- [Installation](#installation)
+- [Usage](#usage)
+- [Contributing](#contributing)
+- [License](#license)
+- [Credits](#credits)
+
+## Overview
+
+### What's Been Accomplished
+
+An open-source Python package, `EMGdecomPy`, has been created to decompose raw EMG signals into their constituent motor unit activity. It contains two elements: a blind source separation algorithm based on the work of [`Negro et al. (2016)`](https://iopscience.iop.org/article/10.1088/1741-2560/13/2/026027/meta) and a visualization element. Experiments of any duration can be run using `EMGdecomPy`.
+
+The blind source separation algorithm has been modified slightly from [`Negro et al. (2016)`](https://iopscience.iop.org/article/10.1088/1741-2560/13/2/026027/meta). The initialization process of the separation vectors has been changed so that instead of initializing every separation vector with the same time instance of highest activity in the pre-processed data, each subsequent vector is initialized with the next highest activity time instance in the pre-processed data.
+
+More customization of the decomposition process is also allowed through different arguments to the `decomposition` function. For example, the separation vectors can be orthogonalized against each other using either the 'source deflation' process described in [`Negro et al. (2016)`](https://iopscience.iop.org/article/10.1088/1741-2560/13/2/026027/meta) or the Gram-Schmidt method.
+
+We have not had the chance to thoroughly validate our algorithm but preliminary results look promising, as 3 out of 5 of the MUAP shapes identified by `EMGdecomPy` were also identified by [`Hug et al. (2021)`](https://figshare.com/articles/dataset/Analysis_of_motor_unit_spike_trains_estimated_from_high-density_surface_electromyography_is_highly_reliable_across_operators/13695937) for the **Gastrocnemius lateralis** muscle with 10% contraction intensity.  Refer to the [documentation](https://emgdecompy.readthedocs.io/en/latest/autoapi/emgdecompy/decomposition/index.html#emgdecompy.decomposition.decomposition) and the [final report](https://github.com/UBC-SPL-MDS/emg-decomPy/blob/main/docs/final-report/final-report.pdf) for more information.
+
+The visualization element allows the user to interactively visualize the results of the blind source separation algorithm. The user can view one motor unit at a time from the motor units that the algorithm extracted from the EMG data. The visualization includes four plots: the instantaneous firing rate vs. time, the signal vs. time, an overlaid version of the previous two plots, and the average motor unit action potential shapes per channel. For a better idea of the interactivity of the plot, refer to the [`EMGdecomPy` workflow notebook](https://github.com/The-Motor-Unit/EMGdecomPy/blob/main/notebooks/emgdecompy-worfklow.ipynb).
+
+### What's Not Working
+
+Currently, the blind source separation algorithm accepts multiple motor units of the same shape. On inspection, many of these motor units have exactly the same firing times or are time-lagged copies of each other. Solutions to these problems are still in development. They include adding an orthogonalization step to the `refinement` function to stop the refinement process from converging on previously extracted motor units, and rejecting any motor unit whose firing times fall within a certain time frame of those of an already accepted motor unit.
+
+There is also a bug in the visualization component that prevents the user from visualizing the results of a decomposition if only one motor unit is accepted. This bug is due to how the peak shapes are created, and a fix is currently in development.
+
+### Future Work
+
+Future work includes fixing the aforementioned problems, increasing code efficiency, improving the accuracy of the algorithm using domain knowledge, and further quantitative/qualitative validation of the results of the algorithm using the data from [`Hug et al. (2021)`](https://figshare.com/articles/dataset/Analysis_of_motor_unit_spike_trains_estimated_from_high-density_surface_electromyography_is_highly_reliable_across_operators/13695937) and other EMG data sources.
+
+A further improvement to the algorithm would be a re-learning feature. The user would run the algorithm on a sample of the data, and then identify inaccurate firing times (false positives) based on physiological limits of motor unit firing rates. Then the algorithm would use this information to no longer make similar mistakes in the rest of the decomposition. Implementing this feature would be quite complex because it is algorithmically unclear how this would be done.
+
+One idea is to somehow change the initialization of the separation vectors so that they no longer identify the false firing times when applied to the pre-processed data. However, since the separation vector changes throughout the LCA and refinement processes, it would be hard to control the effect that this would have on the estimated firing times. Another approach would be to influence the KMeans algorithm so that the threshold for the small peaks cluster includes the false positive peaks, in the hopes that future peaks of similar size are also false positives. The downside to this approach would be that we may increase the number of incorrect identifications of large peaks as small peaks, which are discarded.
+
+An improvement to the visualization related to the above improvement would be the ability to remove peaks with a click of a button. This improvement is already in progress and if the re-learning feature is implemented then these two features can be connected.
+
+## Project Directory
+
+- [.github/workflows](https://github.com/The-Motor-Unit/EMGdecomPy/tree/main/.github/workflows)
+  - Contains the file for automated testing and publishing of the package.
+- [data](https://github.com/The-Motor-Unit/EMGdecomPy/tree/main/data)
+  - Contains a "raw" subdirectory with the EMG data from [`Hug et al. (2021)`](https://figshare.com/articles/dataset/Analysis_of_motor_unit_spike_trains_estimated_from_high-density_surface_electromyography_is_highly_reliable_across_operators/13695937) for the **Gastrocnemius lateralis** muscle at 10% contraction intensity.
+  - In the future, this directory can also contain subdirectories for results from the blind source separation algorithm.
+- [docs](https://github.com/The-Motor-Unit/EMGdecomPy/tree/main/docs)
+  - Contains files related to the final report, the proposal, and the Read the Docs documentation.
+- [notebooks](https://github.com/The-Motor-Unit/EMGdecomPy/tree/main/notebooks)
+  - Contains a Jupyter notebook with a well-documented, reproducible workflow that can be used to apply `EMGdecomPy` to EMG data and/or as a guide on how to use the package.
+- [src](https://github.com/The-Motor-Unit/EMGdecomPy/tree/main/src/emgdecompy)
+  - Contains the `.py` scripts containing `EMGdecomPy` source code.
+- [tests](https://github.com/The-Motor-Unit/EMGdecomPy/tree/main/tests)
+  - Contains the tests for the functions within `src`.
+
 ## Proposal and Final Report
 
 To generate the proposal and final report locally, ensure that you have R version 4.1.2 or above installed, as well as the RStudio IDE. Then install the necessary dependencies with the following commands:
@@ -23,7 +80,9 @@ Our project proposal can be found [here](https://github.com/UBC-SPL-MDS/emg-deco
 
 To generate the proposal locally, run the following command from the root directory after cloning `EMGdecomPy`:
 
-```Rscript -e "rmarkdown::render('docs/proposal/proposal.Rmd')"```
+```
+Rscript -e "rmarkdown::render('docs/proposal/proposal.Rmd')"
+```
 
 Alternatively, if the above doesn't work, install Docker. While Docker is running, run the following command from the root directory after cloning `EMGdecomPy`:
 
@@ -37,7 +96,9 @@ Our final report can be found [here](https://github.com/UBC-SPL-MDS/emg-decomPy/
 
 To generate the final report locally, run the following command from the root directory after cloning `EMGdecomPy`:
 
-```Rscript -e "rmarkdown::render('docs/final-report/final-report.Rmd')"```
+```
+Rscript -e "rmarkdown::render('docs/final-report/final-report.Rmd')"
+```
 
 Alternatively, if the above doesn't work, install Docker. While Docker is running, run the following command from the root directory after cloning `EMGdecomPy`:
 
@@ -55,7 +116,7 @@ pip install emgdecompy
 
 ## Usage
 
-After installing emgdecompy, refer to the [`EMGdecomPy` workflow notebook](https://github.com/UBC-SPL-MDS/EMGdecomPy/blob/main/notebooks/emgdecompy-worfklow.ipynb) for an example on how to use the package, from loading in the data to visualizing the decomposition results.
+After installing `emgdecompy`, refer to the [`EMGdecomPy` workflow notebook](https://github.com/UBC-SPL-MDS/EMGdecomPy/blob/main/notebooks/emgdecompy-worfklow.ipynb) for an example of how to use the package, from loading the data to visualizing the decomposition results. Clone and run the notebook locally to view and interact with the visualization.
 
 ## Contributing
 

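The modified initialization described in the README overview above, where each subsequent separation vector starts from the next-highest activity time instance, can be sketched roughly as follows. This is a minimal illustration, not the package's implementation: the function names `activity_order` and `init_separation_vectors` are hypothetical, and measuring activity as the sum of squared channel values at each time instance is an assumption.

```python
import numpy as np


def activity_order(z, n_vectors):
    """Return the time indices of the n_vectors most active samples.

    z is the pre-processed data with shape (channels, samples).
    Activity per sample is assumed to be the sum of squares across channels.
    """
    activity = (z ** 2).sum(axis=0)
    # Sort ascending, then reverse so the most active sample comes first.
    order = np.argsort(activity)[::-1]
    return order[:n_vectors]


def init_separation_vectors(z, n_vectors):
    """Row i is the initial guess for the i-th separation vector.

    Vector 0 uses the most active time instance, vector 1 the next most
    active one, and so on, instead of all vectors sharing the same start.
    """
    idx = activity_order(z, n_vectors)
    return z[:, idx].T


if __name__ == "__main__":
    # Toy data: 2 channels, 4 samples (hypothetical values).
    z = np.array([[1.0, 5.0, 2.0, 0.0],
                  [0.0, 1.0, 3.0, 0.0]])
    print(activity_order(z, 2))
    print(init_separation_vectors(z, 2))
```

Starting each vector from a different time instance makes it less likely that every iteration begins near the same source, although it does not by itself prevent later convergence to an already extracted motor unit.
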
diff --git a/docs/final-report/final-report.pdf b/docs/final-report/final-report.pdf
diff --git a/docs/final-report/methods.Rmd b/docs/final-report/methods.Rmd
@@ -72,9 +72,9 @@ $$P_{\text{Noise}} = \text{Power of noise}$$
 
 Once the refinement process is done, the refined separation vector is accepted based on a user-defined threshold of either the silhouette score between the signal and the noise or the pulse-to-noise ratio. The silhouette score is defined in **\@ref(eq:sil)** and is calculated using the signal and noise clusters in the MUSTs [@negro_muceli_castronovo_holobar_farina_2016]. The pulse-to-noise ratio is defined in **\@ref(eq:pnr)**. The accepted separation vectors correspond to the MUs that the blind source separation algorithm extracts from the raw signal.
 
-A further improvement to the algorithm that we did not have time to implement would be a re-learning feature. The user would run the algorithm on a sample of the data, and then identify inaccurate inaccurate firing times (false positives) based on physiological limits of MU firing rates. The algorithm would use this information to no longer make similar mistakes in the rest of the decomposition. Implementing this feature would be quite complex because it is unclear how this would be implemented programmatically.
+A further improvement to the algorithm that we did not have time to implement would be a re-learning feature. The user would run the algorithm on a sample of the data, and then identify inaccurate firing times (false positives) based on physiological limits of MU firing rates. The algorithm would use this information to no longer make similar mistakes in the rest of the decomposition. Implementing this feature would be quite complex because it is unclear how this would be programmed into the algorithm.
 
-The stakeholders affected by our blind source separation algorithm are researchers and those that would be affected by their research. This is why our algorithm must work properly so that researchers' results are accurate and do not affect the general public adversely down the line. For example, if someone uses `EMGdecomPy` and obtains inaccurate results which are  used to inform a neuromuscular diagnosis, it could greatly affect someone's life. Periodically, the results of our algorithm should be compared to others to obtain a second opinion on the decomposition of the EMG signal. 
+The stakeholders affected by our blind source separation algorithm are researchers and those who rely on their research. This is why our algorithm must work properly, so that researchers' results are accurate and do not impact the general public adversely down the line. For example, if someone uses `EMGdecomPy` and obtains inaccurate results that are used to inform a neuromuscular diagnosis, it could greatly affect someone's life. Periodically, the results of our algorithm should be compared to others to obtain a second opinion on the decomposition of the EMG signal.
 
 The data provided by @Hug2021 contains EMGs and their decomposition results from many different muscles at different voluntary muscle contraction intensities (100% being the most intensely the subject can contract their muscle). We have not had the chance to thoroughly validate our algorithm using this data, as the debugging process took a great deal of time. So far we have only qualitative results, obtained by visually comparing the MUAP shapes identified by `EMGdecomPy` with those @Hug2021 identified using `DEMUSE`, a commercial software created by @holobar_2016. There are concerns with this approach, as `DEMUSE` uses an algorithm that is similar but not identical to that of @negro_muceli_castronovo_holobar_farina_2016. `DEMUSE` is more widely used than `OTBioLab+`, and therefore the partner wishes to compare our results to those of `DEMUSE`.
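
The re-learning idea discussed above depends on identifying firing times that violate physiological limits on MU firing rates. A minimal sketch of such a check is given below. It is an illustration under stated assumptions, not part of `EMGdecomPy`: the function name `implausible_firings`, the sampling rate `fs`, and the default limit `max_rate_hz=50.0` are all hypothetical placeholders that a user would need to choose for their own data.

```python
import numpy as np


def implausible_firings(firing_idx, fs, max_rate_hz=50.0):
    """Return firing sample indices whose instantaneous rate is too high.

    firing_idx: sample indices at which the motor unit is estimated to fire.
    fs: sampling rate in Hz (assumed known by the caller).
    max_rate_hz: assumed physiological upper limit, a placeholder value.

    A firing is flagged when the rate implied by the interval to the
    previous firing exceeds max_rate_hz.
    """
    # np.unique sorts the indices and drops exact duplicates,
    # so no inter-spike interval is zero.
    firing_idx = np.unique(np.asarray(firing_idx))
    isi_seconds = np.diff(firing_idx) / fs
    inst_rate = 1.0 / isi_seconds
    return firing_idx[1:][inst_rate > max_rate_hz]


if __name__ == "__main__":
    # Toy example: a firing 10 samples after the previous one at 2048 Hz
    # implies about 205 Hz, which exceeds the assumed 50 Hz limit.
    print(implausible_firings([0, 2048, 2058, 4096], fs=2048))
```

This only flags the later firing of each implausibly close pair. Deciding which of the two is the false positive, and how the algorithm should use that information, remains the open problem described above.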