📝 grammars are the worst™

bios2 · graciellehigino · Jun 17, 2021 · Jun 18, 2021 · Jun 29, 2021 · Jun 29, 2021
commit 427c8fc615771e0cf3a7880308698fc233093522
diff --git a/_posts/2021-06-13-unreproducibility-detox/unreproducibility-detox.Rmd b/_posts/2021-06-13-unreproducibility-detox/unreproducibility-detox.Rmd
@@ -568,7 +568,7 @@ Do you already have all your manuscripts in a reproducible format? Congratulatio
 
 ## Why do we need to preserve our tools?
 
-So you've commented, documented, and shared your code meaning that it's ready to be used by the rest of the world, right? Well maybe for now but you know what they say about time - *all hours wound; the last one kills*. Okay so it might not be that dramatic but there is of course the problem that as time progresses our code becomes out-dated and (worst case scenario) non-functional. Programming languages (and packages) are continually evolving as developers work at squashing bugs and making performance upgrades. Sometimes these upgrades might result in a fundamental change in how the a language or package functions _e.g._ a function name might change or some functionality will be removed in favour of another. This means that in a few years that beautifully documented chunk of code that we've written today might not even run. 
+So you've commented, documented, and shared your code meaning that it's ready to be used by the rest of the world, right? Well maybe for now but you know what they say about time - *all hours wound; the last one kills*. Okay that was maybe a bit dramatic but there is of course the problem that as time progresses and languages/packages are updated our code becomes outdated and (worst case scenario) non-functional. Programming languages (and packages) are continually evolving as developers work at squashing bugs and making performance upgrades, and sometimes these upgrades might result in a fundamental change in how a language or package functions _e.g._ a function name might change or some functionality will be removed in favour of another. This means that in a few years that beautifully documented chunk of code that we've written today might not even run. 
 
 Oh dear...
 
@@ -578,23 +578,23 @@ Oh dear...
 
 </center>
 
-What this boils down to is that we need to not only think about documenting the code itself but also all the 'backend' features that make it tick _i.e._ not only what packages we're using but also what version. This can also extend to language and operating system (OS) type or version used. 
+What this boils down to is that we need to not only think about documenting the code itself but also all the 'backend' features that make it tick _i.e._ not only what packages we're using but also what versions. In the bigger scheme of things this should also extend to the version of the language you are using and even the OS (operating system).
 
 Although this may seem daunting it's important to remember that the journey to 
 reproducibility is much like how one approaches eating an elephant - we take 
 it one bit~~e~~ at a time. So don't be afraid to take a little nibble before biting off more than you can chew.
 
 ## How do we _keep_ our work reproducible?
 
-The good news is that there is a lot of functionality out there to help us on our reproducibility journey. Different languages have different ways we can document and 'keep' the package version that we are using. The main focus will be using `R` as it is the current *lingua franca* of most ecologists and it also straddles the middle ground between being very 'picky' like `python` and literally having a built in system like `Julia`. 
+The good news is that there is a lot of functionality out there to help us on our reproducibility journey. Different languages have different ways we can document and 'keep' the package version that we are using. The main focus will be using `R` as it is the current *lingua franca* of most ecologists and it also straddles the middle ground between being very 'picky' like `python` and literally having a built in (although not always perfect) system like `Julia`. 
 
 The big (language agnostic) take home message here though is that it's important to (at minimum) keep record of the versions of things you used if you want your work to work a few months/years down the line. By keeping a record of the package, software and OS versions used we give other users (and our future selves) a chance to recreate the environment that allowed our project/code to run should things change or be updated. 
 
-The three main approaches and packages I will discuss are `{groundhog}`, `{renv}` and, `docker`. There are of course other ways to document package versions but these are (somewhat user friendly) and will give you different 'levels' of reproducibility. It is of course also possible to mix and match these different platforms.
+The three main approaches and packages we will discuss are `{groundhog}`, `{renv}`, and `docker`. There are of course other ways to document package versions but these are (somewhat) user friendly and will give you different 'levels' of reproducibility. It is of course also possible to mix and match these different platforms. So let's start from the bottom and work our way up:
 
 ### `{groundhog}`
 
-[`{groundhog}`](http://groundhogr.com/using/) is a relatively new kid on the block -and apparently refers to a film of the same name (no comment on my side as this is a facet of pop culture the eludes me). This is a super easy package to implement (think one function easy) and is a really nice way to 'retrofit' some of your older code.
+[`{groundhog}`](http://groundhogr.com/using/) is a relatively new kid on the block - and apparently refers to a film of the same name (no comment on my side as this is a facet of pop culture that eludes me). This is a super easy package to implement (think one function easy) and is a really nice way to 'retrofit' some of your older code.
 
 **How it works:** Essentially `{groundhog}` will install the version of a package that was available on CRAN for a specified date. This is done by 'replacing' the `library("package")` with `groundhog.library("package", date)`. This means its easy to go back and set a more suitable date for your script e.g. maybe the date it was created or last time it was saved.
 
@@ -615,39 +615,39 @@ groundhog.library(pkgs, groundhog.day)
 
 ```
 
-**Limitations:** Although `{groundhog}` will call the correct/desired packages version there is of course the potential problem that that package version is no longer compatible with the version of `R` that you're running on your machine --- this means you might have to have multiple version of `R` on you machine and have to switch between them depending on what project you're using. Another issue could arise when retrofitting your workflow. Although you might have a starting date/groundhog day you might not have been using the most up-to-date version available at that date - so you would be retrieving the wrong version.
+**Limitations:** Although `{groundhog}` will call the correct/desired package version there is of course the potential problem that that package version is no longer compatible with the version of `R` that you're running on your machine --- this means you might have to have multiple versions of `R` on your machine and have to switch between them depending on what project you're working on. Another issue could arise when retrofitting your workflow. Although you might have a starting date/groundhog day you might not have been using the most up-to-date version available at that date - so you would still be retrieving the wrong version.
 
 **Pros:** To end on a positive note though - {groundhog} is at least a solid starting point for documenting package version _and_ its very easy to implement, especially if you are retrofitting your code.
 
 ### `{renv}`
 
-As highlighted above one of the potential issues with {groundhog} is that you might run into language version incompatibility - and by extension still have non-working code (bleak). Enter [`{renv}`](https://rstudio.github.io/renv/articles/renv.html), a handy-dandy, easy to use, dependency management package for your projects. `{renv}` records both `R` and package versions through a series of user called functions. This is very similar to `Julia` where all packages are 'stored' in `Project.toml`. `{renv}` works by crawling through your project directory and recording package version and dependencies in use. This is then saved in the `renv.lock` file and is used to 'load' the project state further down the line.
+As highlighted above one of the potential issues with {groundhog} is that you might run into language version incompatibility - and by extension still have non-working code (bleak). Enter [`{renv}`](https://rstudio.github.io/renv/articles/renv.html), a handy-dandy, easy to use, dependency management package for your projects. `{renv}` records both the `R` and package versions through a series of user called functions. This is very similar to `Julia` where all packages are 'stored' in `Project.toml`. `{renv}` works by crawling through your project directory and recording package version and dependencies in use. This is then saved in the `renv.lock` file and is used to 'load' the project state further down the line.
 
 **How it works:** The bare bones overview is that you 1) initialise the project-local environment using `renv::init()`, 2) continue tinkering as you go, 3) call `renv::snapshot()` to update `renv.lock` with any new additions, and 4) if things broke along the way you can call `renv::restore()` to revert back to the previous project state you had saved in your lock file (which hopefully did run).
 
-**Limitations:** One limitation is that `{renv}` relies on you saving a _currently_ working/functioning state (if you want recall it and have it to work in the future). This makes it a bit tricky to try and quickly 'fix' old code - something that `{groundhog}` is probably more suited for, whereas `{renv}` is a solid choice when starting a new project form scratch.
+**Limitations:** One limitation is that `{renv}` relies on you saving a _currently_ working/functioning state (if you want to recall it and have it work in the future). This makes it a bit tricky to try and quickly 'fix' old code - something that `{groundhog}` is probably more suited for, whereas `{renv}` is a solid choice when starting a new project from scratch.
 
-**Pros:** `{renv}` saves both package and `R` versions - which is great as it 'doubles down' on having things work in harmony. It is also very easy to use - once again you can get away by using a few lines of code. 
+**Pros:** `{renv}` saves both package and `R` versions - which is great as it 'doubles down' on having things work in harmony. It is also very easy to use - once again you can get away with using a few lines of code. This makes it a really useful tool to try and make an unconscious part of your day to day coding workflow.
 
 ### Docker
 
-Docker, a term that can strike trepidation in even some of the most hardened of researchers (although they have the cutest whale as a logo and that 100% drops the scary factor if you as me). Briefly Docker is  a program that allows you to host different mini computers on your computer. This of course means its not just an R-specific tool but one that could probably cover a lot of reproducibility bases for most languages. But there is a reason this is last on the list and that is because it takes a bit more work to implement. So think of this as a long-term project/goal to set yourself up for.
+Docker, a term that can strike trepidation in even some of the most hardened of researchers (although they have the cutest whale as a logo and that 100% drops the scary factor if you ask me). Briefly Docker is a program that allows you to host what are essentially different mini computers on your computer. This of course means it's not just an R-specific tool but one that can cover a lot of reproducibility bases for most languages. But there is a reason this is last on the list and that is because it takes a bit more work to implement. So think of this as a long-term project/goal to set yourself up for.
 
-**How it works:** As I said earlier with Docker you can run multiple mini computers (containers) built from an 'image' of your machine (the host). The catch though - you need to build the image from scratch from OS all the way through to you specific script/code chunk. These build instructions are contained in a `Dockerfile` - which you save in your working directory. Inside this file is the 'recipe' for building your image (and spoiler alert it looks a lot like a series of command line calls). Colin Fay wrote [this](https://colinfay.me/docker-r-reproducibility/) really nice blog about using docker and `R` for beginners. If your interested I suggest starting there! Alternatively `{renv}` also plays well with Docker - have a look at [this vignette](https://rstudio.github.io/renv/articles/docker.html)
+**How it works:** As mentioned earlier with Docker you can run multiple mini computers (containers) on your machine (the host), each built from an 'image'. The catch though - you need to build the image from scratch from OS all the way through to your specific script/code chunk. These build instructions are contained in a `Dockerfile` - which you save in your working directory. Inside this file is the 'recipe' for building your image (and spoiler alert it looks a lot like a series of command line calls). Colin Fay wrote [this](https://colinfay.me/docker-r-reproducibility/) really nice blog post about using Docker and `R` for beginners. If you're interested I suggest starting there! Alternatively `{renv}` also plays well with Docker - have a look at [this vignette](https://rstudio.github.io/renv/articles/docker.html).
 
-**Limitations:** In the context of what has been discussed in this post Docker is _hard_ yo! In order to write a Docker file you will benefit a lot from being comfortable using and thinking of things in terms of command line. Since you are 'creating' you mini computer you need to install a lot of moving parts and components. This means you might be moving from your comfort zone when it comes to programming and could put you off trying the whole reproducibility thing all together. So set realistic expectations here and don't be too hard on yourself!
+**Limitations:** In the context of what has been discussed in this post Docker is _hard_ yo! In order to write a `Dockerfile` you will benefit a lot from being comfortable using and thinking of things in terms of the command line. Since you are 'creating' your mini computer you need to install a lot of moving parts and components. This means you might be moving out of your comfort zone when it comes to programming, which could put you off trying the whole reproducibility thing altogether. So set realistic expectations here and don't be too hard on yourself!
 
-**Pros:** Docker is very flexible! You can build your mini computer to your specifications and keep your 'normal computer' intact. For example if I am running MacOS, `R` 3.5 on my normal computer but can build an image that runs Linux and `R` 3.1. Also because the recipe is contained in the `Dockerfile` anyone can build the image for that project on their machine and have it all 'just' work (avoiding the whole 'but it works on my machine' scenario).
+**Pros:** Docker is very flexible! You can build your mini computer to your specifications and keep your 'normal computer' intact. For example if I am running macOS and `R` 3.5 on my normal computer I can also build an image that runs Linux and `R` 3.1. Also because the recipe is contained in the `Dockerfile` anyone can build the image for that project on their machine and have it all 'just' work (avoiding the whole 'but it works on my machine' scenario).
 
 ## Closing thoughts
 
 If you want to keep your project pipeline working in the long-term it is important to account for the fact that languages are evolving - which means the scaffold on which your code rests also needs to be documented in some way. That being said asking yourself as to how _paramount_ the longevity of your project is a good way to identify and allocate resources to documenting and accommodating for this. For smaller projects you could probably get away with a simple documentation process e.g. `Julia`'s `Project.toml` system or `{renv}` for `R`. But if the longevity of the project is of high importance it's probably recommended to give something like Docker a try. 
 
 ## Reproducibility task of the day
 
-First sit down and think about your project and how important longevity is. Do future generations depend on your code being able to run and execute tasks flawlessly? Or it it more important that the workflow is well documented and understood _i.e._ it could be easily be 'translated' to the shiny new programming language people are using?
+First sit down and think about your project and how important its longevity is. Do future generations depend on your code being able to run and execute tasks flawlessly? Or is it more important that the workflow is well documented and understood _i.e._ it could easily be 'translated' to the shiny new programming language people are using?
 
-Pick and choose the task(s) that you want to take on (or remix one of them)
+Pick and choose the task(s) that you want to take on (or remix one of them).
 
 1. Open one of the older projects on you computer. Does the code run? If no see if you can retrofit it using {groundhog}