diff --git a/02_getting_started_with_r.qmd b/02_getting_started_with_r.qmd
index bebd44a..aa1f437 100644
--- a/02_getting_started_with_r.qmd
+++ b/02_getting_started_with_r.qmd
@@ -639,11 +639,39 @@ multiply_solutions(a = 1, b = -1, c = -3, multiplier = 10)
All of the objects that we create inside a function will disappear after it finishes running. We can only save the output of the function by assigning it to an object.
:::
+## Acquiring external packages
+
+We don't need to reinvent the wheel every time we need to do something that is not available in **R**'s default version. We can easily download packages from CRAN's online repositories to get many useful functions.
+
+To install a package from CRAN, we can use the `install.packages()` function. For example, if we wanted to install the package `readxl` (for loading .xlsx files), we would run:
+```{r installing readxl}
+#| eval: false
+install.packages("readxl", dependencies = TRUE)
+```
+
+The argument `dependencies` tells **R** whether it should also download other packages that `readxl` needs to work.
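+
+Reinstalling a package every time we run a script wastes time. As a minimal sketch (one possible pattern, not the only one), we can first check whether `readxl` is already installed with `requireNamespace()` and only call `install.packages()` if it is missing:
+```{r installing readxl only if missing}
+#| eval: false
+# requireNamespace() returns TRUE if the package is installed and can be loaded
+if (!requireNamespace("readxl", quietly = TRUE)) {
+  install.packages("readxl", dependencies = TRUE)
+}
+```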
+
+**R** may ask you to select a CRAN mirror, which, simply put, is the location of the servers you want to download from. Choose a mirror close to where you are.
+
+After installing a package, we need to load it into **R** before we can use its functions. To load the package `readxl`, we need to use the function `library()`, which will also load any other packages required to load `readxl` and may print additional information.
+```{r loading readxl}
+#| eval: false
+library("readxl")
+```
+
+Every time we start a new **R** session we need to load the packages we need. If we try to run a function without loading its package first, we will get an error message saying that **R** could not find it.
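+
+To see this error without installing anything, here is a small sketch that uses `tools`, a package that ships with **R** but is usually not loaded when a session starts (that starting state is an assumption; if `tools` is already loaded in your session, the first call will simply work):
+```{r function not found before library}
+# file_ext() lives in the package tools; tryCatch() captures the error message
+# instead of stopping the script
+before <- tryCatch(
+  file_ext("flower.csv"),
+  error = function(e) conditionMessage(e)
+)
+before
+
+# After loading the package, R can find the function
+library("tools")
+after <- file_ext("flower.csv")
+after
+```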
+
+Writing all our `library()` statements at the top of our **R** scripts is almost always a good idea. This reminds us to load the packages at the start of our sessions, and it quickly tells others which packages they need to have installed to use our code.
+
+Sometimes we only need one or two functions from a package. To avoid loading the entire package, we can access a specific function directly by writing the package name, followed by two colons, and then the function name. For example:
+```{r using specific function from library}
+#| eval: false
+readxl::read_xlsx("fake_data_file.xlsx")
+```
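+
+The file name in the example above looks like a placeholder, so that line will probably not run as is. As a runnable sketch of the same `::` syntax, we can call `file_ext()` from the package `tools` (which ships with **R**) without ever calling `library("tools")`:
+```{r double colon with tools}
+# package name, two colons, function name
+extension <- tools::file_ext("fake_data_file.xlsx")
+extension
+```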
+
## Exercise
Let's try to practice all of the basic features of **[R]{.sans-serif}** that you just learned.
Write a function that can simulate the roll of a pair of six-sided dice (let's call them red and blue) an arbitrary number of times. This function should return a vector with the values of the red die that were strictly larger than the corresponding values of the blue die. Hint: to simulate rolling a die, you can use the function `sample()`.
-
-Bonus: modify your function so that the dice can have different numbers of faces.
\ No newline at end of file
diff --git a/docs/02_getting_started_with_r.html b/docs/02_getting_started_with_r.html
index 5981060..ceccf5f 100644
--- a/docs/02_getting_started_with_r.html
+++ b/docs/02_getting_started_with_r.html
@@ -236,7 +236,8 @@
We don’t need to reinvent the wheel every time we need to do something that is not available in R’s default version. We can easily download packages from CRAN’s online repositories to get many useful functions.
+
To install a package from CRAN, we can use the install.packages() function. For example, if we wanted to install the package readxl (for loading .xlsx files), we would run:
+
+
install.packages("readxl", dependencies = TRUE)
+
+
The argument dependencies tells R whether it should also download other packages that readxl needs to work.
+
R may ask you to select a CRAN mirror, which, simply put, is the location of the servers you want to download from. Choose a mirror close to where you are.
+
After installing a package, we need to load it into R before we can use its functions. To load the package readxl, we need to use the function library(), which will also load any other packages required to load readxl and may print additional information.
+
+
library("readxl")
+
+
Every time we start a new R session we need to load the packages we need. If we try to run a function without loading its package first, we will get an error message saying that R could not find it.
+
Writing all our library() statements at the top of our R scripts is almost always a good idea. This reminds us to load the packages at the start of our sessions, and it quickly tells others which packages they need to have installed to use our code.
+
Sometimes we only need one or two functions from a package. To avoid loading the entire package, we can access a specific function directly by writing the package name, followed by two colons, and then the function name. For example:
+
+
readxl::read_xlsx("fake_data_file.xlsx")
+
+
+
+
2.10 Exercise
Let’s try to practice all of the basic features of R that you just learned.
Write a function that can simulate the roll of a pair of six-sided dice (let’s call them red and blue) an arbitrary number of times. This function should return a vector with the values of the red die that were strictly larger than the corresponding values of the blue die. Hint: to simulate rolling a die, you can use the function sample().
-
Bonus: modify your function so that the dice can have different numbers of faces.
diff --git a/docs/search.json b/docs/search.json
index 3282347..7d0eb2b 100644
--- a/docs/search.json
+++ b/docs/search.json
@@ -122,8 +122,8 @@
"objectID": "02_getting_started_with_r.html#exercise",
"href": "02_getting_started_with_r.html#exercise",
"title": "2 Getting Started with R",
- "section": "2.9 Exercise",
- "text": "2.9 Exercise\nLet’s try to practice all of the basic features of R that you just learned.\nWrite a function that can simulate the roll of a pair of six-sided dice (let’s call them red and blue) an arbitrary number of times. This function should return a vector with the values of the red die that were strictly larger than the corresponding values of the blue die. Hint: to simulate rolling a die, you can use the function sample().\nBonus: modify your function so that the dice can have different numbers of faces."
+ "section": "2.10 Exercise",
+ "text": "2.10 Exercise\nLet’s try to practice all of the basic features of R that you just learned.\nWrite a function that can simulate the roll of a pair of six-sided dice (let’s call them red and blue) an arbitrary number of times. This function should return a vector with the values of the red die that were strictly larger than the corresponding values of the blue die. Hint: to simulate rolling a die, you can use the function sample()."
},
{
"objectID": "02_getting_started_with_r.html#footnotes",
@@ -319,7 +319,7 @@
"href": "04_basic_data_processing.html#loading-data",
"title": "4 Basic data processing",
"section": "4.1 Loading data",
- "text": "4.1 Loading data\nOnce we know where to find data files in our computer, we can start loading them into R. Note, however, that we need specific ways to open different file formats.\n\n4.1.1 Plain text files\nPlain-text files are simple and many programs can read them. This is why many organizations (e.g., the Census Bureau, the Social Security Administration, etc.) publish their data as plain-text files.\nA plain-text file stores a table of data in a text document. Each row of the table is saved on its own line, and a simple symbol separates the cells within a row. This symbol is often a comma, but it can also be a tab, a pipe delimiter |, or any other character. Each file only uses one symbol to separate cells, which minimizes confusion.\nWe will work with data from this1 plain text file. Use Ctrl+Shift+s to download the file. Then save it in your working directory with the name “flower”.\n\n4.1.1.1 read.table\nread.table() can load plain-text files. The first argument of read.table() is the name of your file (if it is in your working directory), or the file path to your file (if it is not in your working directory).\n\nflower_df <- read.table(\"data_files/flower.csv\", header = TRUE, sep = \",\")\n\nIn the code above, I added two more arguments, header and sep. header tells R whether the first line of the file contains variable names instead of values. sep tells R the symbol that the file uses to separate the cells.\nSometimes a plain-text file starts with text that is not part of the data set. Or, maybe we want to read only part of a data set. Argument skip tells R to skip a specific number of lines before it starts reading in values from the file. Argument nrow tells R to stop reading in values after it has read in a certain number of lines. 
Keep in mind that the header row doesn’t count towards the total rows allowed by nrow.\n\nflower_df_chunk <- read.table(\n \"data_files/flower.csv\", \n header = TRUE, \n sep = \",\", \n skip = 0, \n nrow = 3\n)\nflower_df_chunk\n\n treat nitrogen block height weight leafarea shootarea flowers\n1 tip medium 1 7.5 7.62 11.7 31.9 1\n2 tip medium 1 10.7 12.14 14.1 46.0 10\n3 tip medium 1 11.2 12.76 7.1 66.7 10\n\n\nread.table() has other arguments that you can tweak. You can consult the function’s help page to know more about it.\n\n\n4.1.1.2 Shortcuts for read.table\nR has shortcut functions that call read.table() in the background with different default values for popular types of files:\n\nread.table is the general purpose read function.\nread.csv reads comma-separated values (.csv) files.\nread.delim reads tab-delimited files.\nread.csv2 reads .csv files with European decimal format.\nread.delim2 reads tab-delimited files with European decimal format.\n\n\n\n4.1.1.3 read.fwf\nThere is a type of plain-text file called fixed-width file (.fwf). Instead of a symbol, a fixed-width file uses its layout to separate data cells. Each row is still in a single line, and each column begins at a specific number of characters from the left-hand side of the document. To correctly position its data, the file adds an arbitrary number of character spaces between data entries.\nIf our flowers data came in a fixed-width file, the first few lines would look like this:\n\ntreat nitrogen block height weight leafarea shootarea flowers\ntip medium 1 7.5 7.62 11.7 31.9 1\ntip medium 1 10.7 12.14 14.1 46.0 10\ntip medium 1 11.2 12.76 7.1 66.7 10\ntip medium 1 10.4 8.78 11.9 20.3 1\ntip medium 1 10.4 13.58 14.5 26.9 4\ntip medium 1 9.8 10.08 12.2 72.7 9\n\nFixed-width files may be visually intuitive, but they are difficult to work with. 
Perhaps because of this, R has a function for reading fixed-width files, but not for saving them.\nYou can read fixed-width files into R with the function read.fwf(). This function adds another argument to the ones from read.table(): widths, which should be a vector of numbers. Each ith entry of the widths vector should state the width (in characters) of the ith column of the data set.\n\n\n4.1.1.4 HTML links\nread.table and its shortcuts allow us to load data files directly from a website. Instead of using the file’s path or name, we can directly use a web address in the file argument of the function. Do make sure that you are using the web address that links directly to the file, not to a web page that has a link to the file."
+ "text": "4.1 Loading data\nOnce we know where to find data files in our computer, we can start loading them into R. Note, however, that we need specific ways to open different file formats.\n\n4.1.1 Plain text files\nPlain-text files are simple and many programs can read them. This is why many organizations (e.g., the Census Bureau, the Social Security Administration, etc.) publish their data as plain-text files.\nA plain-text file stores a table of data in a text document. Each row of the table is saved on its own line, and a simple symbol separates the cells within a row. This symbol is often a comma, but it can also be a tab, a pipe delimiter |, or any other character. Each file only uses one symbol to separate cells, which minimizes confusion.\nWe will work with data from this1 plain text file. Use Ctrl+Shift+s to download the file. Then save it in your working directory with the name “flower”.\n\n4.1.1.1 read.table\nread.table() can load plain-text files. The first argument of read.table() is the name of your file (if it is in your working directory), or the file path to your file (if it is not in your working directory).\n\nflower_df <- read.table(\"data_files/flower.csv\", header = TRUE, sep = \",\")\n\nIn the code above, I added two more arguments, header and sep. header tells R whether the first line of the file contains variable names instead of values. sep tells R the symbol that the file uses to separate the cells.\nSometimes a plain-text file starts with text that is not part of the data set. Or, maybe we want to read only part of a data set. Argument skip tells R to skip a specific number of lines before it starts reading in values from the file. Argument nrow tells R to stop reading in values after it has read in a certain number of lines. 
Keep in mind that the header row doesn’t count towards the total rows allowed by nrow.\n\nflower_df_chunk <- read.table(\n \"data_files/flower.csv\", \n header = TRUE, \n sep = \",\", \n skip = 0, \n nrow = 3\n)\nflower_df_chunk\n\n treat nitrogen block height weight leafarea shootarea flowers\n1 tip medium 1 7.5 7.62 11.7 31.9 1\n2 tip medium 1 10.7 12.14 14.1 46.0 10\n3 tip medium 1 11.2 12.76 7.1 66.7 10\n\n\nread.table() has other arguments that you can tweak. You can consult the function’s help page to know more about it.\n\n\n4.1.1.2 Shortcuts for read.table\nR has shortcut functions that call read.table() in the background with different default values for popular types of files:\n\nread.table is the general purpose read function.\nread.csv reads comma-separated values (.csv) files.\nread.delim reads tab-delimited files.\nread.csv2 reads .csv files with European decimal format.\nread.delim2 reads tab-delimited files with European decimal format.\n\n\n\n4.1.1.3 Excel files\nThe best way to load Excel files (.xlsx) into R is not to use Excel files. Instead, save these files as .csv or .text files and then use read.table. Excel files can include multiple spreadsheets, macros, colors, dynamic tables, and other hidden, complicated formats. All of these make it difficult for R to read the files properly. Plain text files are simpler, so we can load, and transfer them more easily.\nStill, there are ways to load Excel files into R if we really need to. R has no native way of loading these files, but we can use the package readxl. If you don’t have it installed, you can type install.packages(\"readxl\"). Then\n\n\n4.1.1.4 HTML links\nread.table and its shortcuts allow us to load data files directly from a website. Instead of using the file’s path or name, we can directly use a web address in the file argument of the function. 
Do make sure that you are using the web address that links directly to the file, not to a web page that has a link to the file.\n\n\n4.1.1.5 read.fwf\nFixed-width file (.fwf) is a type of plain-text file that, instead of a symbol, uses its layout to separate data cells. Each row is still in a single line, and each column begins at a specific number of characters from the left-hand side of the document. To correctly position its data, the file adds an arbitrary number of character spaces between data entries.\nIf our flowers data came in a fixed-width file, the first few lines would look like this:\n\ntreat nitrogen block height weight leafarea shootarea flowers\ntip medium 1 7.5 7.62 11.7 31.9 1\ntip medium 1 10.7 12.14 14.1 46.0 10\ntip medium 1 11.2 12.76 7.1 66.7 10\ntip medium 1 10.4 8.78 11.9 20.3 1\ntip medium 1 10.4 13.58 14.5 26.9 4\ntip medium 1 9.8 10.08 12.2 72.7 9\n\nFixed-width files may be visually intuitive, but they are difficult to work with. Perhaps because of this, R has a function for reading fixed-width files, but not for saving them.\nYou can read fixed-width files into R with the function read.fwf(). This function adds another argument to the ones from read.table(): widths, which should be a vector of numbers. Each ith entry of the widths vector should state the width (in characters) of the ith column of the data set."
},
{
"objectID": "04_basic_data_processing.html#cleaning-data",
@@ -348,5 +348,12 @@
"title": "4 Basic data processing",
"section": "",
"text": "You can find the original file here courtesy of Douglas et al. (see references).↩︎"
+ },
+ {
+ "objectID": "02_getting_started_with_r.html#acquiring-external-packages",
+ "href": "02_getting_started_with_r.html#acquiring-external-packages",
+ "title": "2 Getting Started with R",
+ "section": "2.9 Acquiring external packages",
+ "text": "2.9 Acquiring external packages\nWe don’t need to reinvent the wheel every time we need to do something that is not available in R’s default version. We can easily download packages from CRAN’s online repositories to get many useful functions.\nTo install a package from CRAN, we can use the install.packages() function. For example, if we wanted to install the package readxl (for loading .xlsx files), we would run:\n\ninstall.packages(\"readxl\", dependencies = TRUE)\n\nThe argument dependencies tells R whether it should also download other packages that readxl needs to work.\nR may ask you to select a CRAN mirror, which, simply put, is the location of the servers you want to download from. Choose a mirror close to where you are.\nAfter installing a package, we need to load it into R before we can use its functions. To load the package readxl, we need to use the function library(), which will also load any other packages required to load readxl and may print additional information.\n\nlibrary(\"readxl\")\n\nEvery time we start a new R session we need to load the packages we need. If we try to run a function without loading its package first, we will get an error message saying that R could not find it.\nWriting all our library() statements at the top of our R scripts is almost always a good idea. This reminds us to load the packages at the start of our sessions, and it quickly tells others which packages they need to have installed to use our code.\nSometimes we only need one or two functions from a package. To avoid loading the entire package, we can access a specific function directly by writing the package name, followed by two colons, and then the function name. For example:\n\nreadxl::read_xlsx(\"fake_data_file.xlsx\")"
}
]
\ No newline at end of file