diff --git a/docs/01_the_r_environment.html b/docs/01_the_r_environment.html index e6c134c..1887fc0 100644 --- a/docs/01_the_r_environment.html +++ b/docs/01_the_r_environment.html @@ -119,67 +119,49 @@ - - - @@ -697,7 +679,7 @@

< diff --git a/docs/02_getting_started_with_r.html b/docs/02_getting_started_with_r.html index 2617051..a63f05f 100644 --- a/docs/02_getting_started_with_r.html +++ b/docs/02_getting_started_with_r.html @@ -64,7 +64,7 @@ - + @@ -157,65 +157,47 @@ 2  Getting Started with R - - - - @@ -521,7 +503,7 @@

snake case naming). You can use whichever one you want. Just be consistent. +
  • If you need to create objects with multiple words in their name, separate them with an underscore (my_value) or a dot (my.value), or capitalize the different words (MyValue). I like the underscore format (called snake case naming). You can use whichever one you want. Just be consistent.
  • Use informative names. It is quick and easy to use names like x or my_value. But your code will be easier and faster to understand if your objects have names that illustrate what you want to do with them. Your colleagues and your future self will really appreciate it.
  • @@ -1421,8 +1403,8 @@

    - - 3  Using Scripts + + 3  Data for Analysis diff --git a/docs/03_using_scripts.html b/docs/03_using_scripts.html deleted file mode 100644 index 7711bba..0000000 --- a/docs/03_using_scripts.html +++ /dev/null @@ -1,851 +0,0 @@ - - - - - - - - - -Introduction to R - 3  Using Scripts - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    -
    - -
    - -
    - - -
    - - - -
    - -
    -
    -

    3  Using Scripts

    -
    - - - -
    - - - - -
    - - - -
    - - -

    Until now, we have submitted each command to R by typing directly at the command prompt. In almost all situations, it is much better to type code in a separate file called a script file. There are many advantages of using script files.

    -
      -
    • Repeatability

    • -
    • Editing

    • -
    • Submitting many commands at once

    • -
    -
    -

    3.1 Scripts in R GUI

    -

    To open an R script, go to File -> New Script. You can type your commands in the resulting window. It is useful to save your script files with a .R extension so that the operating system recognizes them as R files.

    -

    After opening a new script, type

    -

    > history()

    -

    A new window will open with the last 25 commands you have used. You can get a longer history with, say,

    -

    > history(100)

    -

    Select all your commands and copy-paste them to the new script. Save the file with a .R extension.

    -

    There are several ways of submitting commands from the R script window.

    -
      -
    • Copy and Paste from the script window to the interpreter window

    • -
    • Control-r

    • -
    • The source command. If your file is saved as myfile.R, you can run all the commands in the file by typing the following line at the command prompt.

      -

      > source("myfile.R")

      -

      In order for R to read your script, you must use the full path or be in the correct working directory. To change the current working directory, go to File -> Change Dir, then browse to the appropriate folder.

      -

      To see the files in the current working directory, type:

      -

      > dir()

      -

      Another possibility is to find the file using

      -

      > source(file.choose())

      -

      but this does not change the working directory.

    • -
    -
    -
    -
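    As a minimal sketch of the two ways to point source() at a script (full path versus working directory), the following writes a throwaway script to a temporary folder and runs it both ways. The file name myfile.R and its two-line contents are assumptions made purely for illustration.

```r
# Write a tiny script to a temporary folder (hypothetical contents).
script_path = file.path(tempdir(), "myfile.R")
writeLines(c("a = 1 + 1", "b = a * 10"), script_path)

# Option 1: give source() the full path; the working directory is unchanged.
source(script_path)
b

# Option 2: change the working directory, then source by file name alone.
old_dir = setwd(tempdir())   # setwd() returns the previous working directory
"myfile.R" %in% dir()        # confirm the script is visible from here
source("myfile.R")
setwd(old_dir)               # restore the original working directory
```

    Restoring the old working directory at the end keeps the rest of the session unaffected.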

    3.2 Comments

    -

    Documentation and formatting are essential to writing effective R code. If you come back to a project after a few months (days? hours?), you want to know what the code is doing without retracing every single step. Comments in R can be inserted with the # symbol. R will not process the rest of the line after the #.

    -

    > 5 < 6

    -

    > 5 # < 6

    -

    The following is an example of a commented R script. Some of the functions we have used before; others will be explained later. Let’s open it in RStudio and see what it does.

    -
    -
    -

    3.3 RStudio

    -

    RStudio is an interface to R. It organizes the user’s screen into windows that display programs (scripts), objects, graphics, and the R interpreter. Launch RStudio from the start menu. Click on the folder in the upper left corner and browse to the folder containing the script “sample script.R”. The following should open in the upper left window.

    -
    -
    -

    3.4 Sample Script

    -

    # sample script.R

    -

    # Chris Andrews

    -

    # Created 2015 04 01

    -

    # Last Modified 2019 02 03

    -

    # This script analyzes the Life Cycle Savings data.

    -

    # See help(LifeCycleSavings) for more details.

    -

    #

    -

    # Shorten the name and make local copy

    -

    Life = LifeCycleSavings

    -

    # Examine Structure

    -

    head(Life)

    -

    dim(Life)

    -

    str(Life)

    -

    # Descriptives

    -

    summary(Life)

    -

    # Pairwise associations

    -

    cor(Life)

    -

    pairs(Life)

    -

    # Fit a multiple regression model

    -

    mr.mod = lm(sr ~ pop15 + pop75 + dpi + ddpi, data=Life)

    -

    summary(mr.mod)

    -

    anova(mr.mod)

    -

    # Plot the fit

    -

    par(mfrow=c(2,2), las=1, mar=c(4.5,4.5,2,1))

    -

    plot(mr.mod)

    -

    Use “Ctrl-Enter” to submit each line of the script. Take note of what happens as each line is submitted to the interpreter.

    - - -
    - -
    - - -
    - - - - - \ No newline at end of file diff --git a/docs/04_objects.html b/docs/04_objects.html deleted file mode 100644 index 1764444..0000000 --- a/docs/04_objects.html +++ /dev/null @@ -1,976 +0,0 @@ - - - - - - - - - -Introduction to R - 4  Objects - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    -
    - -
    - -
    - - -
    - - - -
    - -
    -
    -

    4  Objects

    -
    - - - -
    - - - - -
    - - - -
    - - -
    -

    4.1 Assignment

    -

    An object is an entity that contains information and can be manipulated by commands. R has two main operators for assigning a value to an object: ‘<-’ and ‘=’.

    -

    > x <- 5

    -

    > x = 5

    -

    We will use ‘=’ throughout this document. However, many R users prefer ‘<-’, because ‘=’ is used for other things, too. A third method is very rarely used:

    -

    > 5 -> x

    -

    Each of the previous commands assigns the number 5 to the object x. Notice that R produces no output when the above commands are run. In order to see what R has done, type:

    -

    > ls()

    -

    and/or look at the environment window in the upper right corner. Now type

    -

    > x

    -

    When you submit a command to R, one of three things can happen.

    -
      -
    1. You see a result: e.g.,

      -

      > x

      -

      R prints the value of the expression.

    2. -
    3. You see nothing except another command prompt: e.g.,

      -

      > y = log(x)

      -

      For an assignment, R stores the value of log(x) in the object y, but produces no output.

    4. -
    5. You see an error message: e.g.,

      -

      > y = lg(x)

      -

      Look at error messages – they can be informative!

    6. -
    -
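    The three outcomes listed above can be reproduced in a short script. This is only a sketch: tryCatch() is used here, as an assumption about how you might want to inspect an error, so that the message can be stored without stopping the script.

```r
x = 5

# 1. A result is printed.
x

# 2. Nothing is printed: the assignment stores the value silently.
y = log(x)

# 3. An error: lg() is not a function R knows about (a typo for log).
#    tryCatch() stores the error message instead of halting the script.
msg = tryCatch(lg(x), error = function(e) conditionMessage(e))
msg
```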
    -
    -

    4.2 Manipulating Objects

    -

    We can perform mathematical operations on objects such as x.

    -

    > x + 2

    -

    Notice that x has not changed:

    -

    > x

    -

    We can change the value of x:

    -

    > x = x + 2

    -

    > x

    -

    Cautionary Tip: Take care when overwriting a variable as above. The old value is gone, so if you need to use x later on, be sure you are using the correct value!

    -

    Start from scratch and perform operations on two objects.

    -

    > x = 5

    -

    > y = 2

    -

    > x - y

    -

    If two objects are assigned the same value, changing one later does not change the other. (Assignment is by value, not by reference, for those of you who know what that means.)

    -

    > a = 3

    -

    > b = a  # Note: Assignment

    -

    > b == a # Note: Test of Equality

    -

    > a = a + 1

    -

    > a

    -

    The value of b didn’t change.

    -

    > b

    -

    Assign a vector of numbers to the object x

    -

    > x = c(3, 5, 9, 10)

    -

    > x

    -

    Get a list of the objects in the workspace.

    -

    > ls()

    -

    Remove an object.

    -

    > rm(x)

    -
    -
    -

    4.3 Indexing Objects

    -

    You will frequently want to access selected portions of your data. In this section, we discuss how to extract elements of vectors and matrices.

    -
    -

    4.3.1 Indexing Vectors

    -

    > x = c(13,21,99,10,0,-6)

    -

    Suppose that we only need the first element of the vector x. To extract the first element, we type the name of the entire vector, followed by the index we want to extract enclosed in brackets.

    -

    > x[1]

    -

    We can save the extracted part to a new object

    -

    > z = x[1]

    -

    > z

    -

    We often will want to extract more than one element of a vector. Each of the following two lines of code extracts the first three elements of the vector x.

    -

    > x[c(1,2,3)]

    -

    > x[1:3]

    -

    What happens if we try to extract the first three elements in the following way?

    -

    > x[1,2,3]

    -

    Elements can be extracted in any order and elements can be extracted any number of times. All of the following are legitimate methods of extracting multiple elements from a vector.

    -

    > x[c(2,4,5)]

    -

    > x[c(4,5,1)]

    -

    > x[c(5,1,5,2,1,1,1,5)]

    -

    The following code extracts all elements of x except the second.

    -

    > x[-2]

    -

    What will this do?

    -

    > x[-c(2,4)]
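    For reference, here is a small sketch that stores the results of a few of the indexing commands from this section so they can be inspected. The object names first_three, drop_two, and bad are hypothetical, and tryCatch() is used only to show that x[1,2,3] fails rather than returning elements.

```r
x = c(13, 21, 99, 10, 0, -6)

first_three = x[1:3]        # 13 21 99
drop_two    = x[-c(2, 4)]   # drops the 2nd and 4th elements: 13 99 0 -6

# x[1,2,3] asks for three dimensions, but a vector has only one,
# so R signals an error instead of returning elements.
bad = tryCatch(x[1, 2, 3], error = function(e) "error")
bad
```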

    -
    -
    -

    4.3.2 Indexing Matrices

    -

    To extract an element from a matrix, specify two values: the row index and the column index. The row and column are separated by a comma.

    -

    > M1 = matrix(1:12, nrow=3, byrow=TRUE) # (this is obj5 from before, so M1 = obj5 works too)

    -

    > M1

    -

    Pick out the number from the second row and third column.

    -

    > M1[2,3]

    -

    You can simultaneously select multiple rows and multiple columns.

    -

    > M1[2,c(1,3)]

    -

    > M1[c(2,3),c(1,2)]

    -

    If nothing is specified in the row position (before the comma), then every row is kept. Similarly, every column is kept if nothing is specified in the column position.

    -

    > M1[,c(2,3)]

    -

    > M1[c(1,2),]

    -

    If nothing is specified in either position, the entire matrix is returned.
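    The matrix indexing rules above can be collected into one sketch. The object names are hypothetical. The drop = FALSE argument at the end is an addition not covered in the text: by default, selecting a single row returns a plain vector, and drop = FALSE keeps it as a one-row matrix.

```r
M1 = matrix(1:12, nrow = 3, byrow = TRUE)

one_value = M1[2, 3]              # row 2, column 3
sub_block = M1[c(2, 3), c(1, 2)]  # rows 2-3, columns 1-2: still a matrix
two_cols  = M1[, c(2, 3)]         # every row, columns 2 and 3
whole     = M1[, ]                # nothing specified: the whole matrix

# A single row comes back as a vector unless drop = FALSE is given.
row_vec = M1[2, ]
row_mat = M1[2, , drop = FALSE]
dim(row_vec)   # NULL
dim(row_mat)   # 1 4
```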

    -
    -
    -
    -

    4.4 Index Assignment

    -

    In addition to extracting certain indices, it is also possible to assign new values to certain elements of a vector or matrix.

    -

    The following two lines of code change an element of the vector x and the matrix M1.

    -

    > x[3] = 5

    -

    > M1[2,3] = 6
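    A slightly fuller sketch of index assignment, assuming the same x and M1 as in the indexing sections. The last two assignments go beyond the text and show that several elements can be replaced in one step.

```r
x  = c(13, 21, 99, 10, 0, -6)
M1 = matrix(1:12, nrow = 3, byrow = TRUE)

x[3] = 5          # replace the third element (was 99)
M1[2, 3] = 6      # replace row 2, column 3 (was 7)

# Several elements can be replaced at once.
x[c(1, 2)] = c(0, 0)
M1[1, ] = 0       # the single value is recycled across the whole first row

x
M1
```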

    -
    -
    -

    4.5 Aside: Missing Index?

    -

    If an index is missing, it might be any index. This is rarely what you want: Avoid missing values in your index.

    -

    > x[NA]
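    To make the behavior concrete, here is a sketch assuming the six-element x from the vector indexing section. The guard at the end, which drops missing values from the index before using it, is one possible approach and not the only one.

```r
x = c(13, 21, 99, 10, 0, -6)

# A bare NA is a logical value, so it is recycled like TRUE/FALSE
# and every position comes back as NA.
all_na = x[NA]

# An NA inside a numeric index gives NA only at that position.
some_na = x[c(1, NA, 3)]

# One way to guard against this: drop missing values from the index first.
idx   = c(1, NA, 3)
clean = x[idx[!is.na(idx)]]
clean
```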

    -
    -
    -

    4.6 Object Classes

    -

    So far we seem to have been working exclusively with numeric objects. R can store objects of many different types. Suppose you are working with a data set that includes both quantitative and categorical variables. R can store these as different classes. Let’s begin by looking at two basic classes, numeric and character.

    -

    > x = 12

    -

    > class(x)

    -

    > y = c(3,5,2)

    -

    > class(y)

    -

    R stores both the number 12 and the vector c(3,5,2) as an object of the class numeric. Strings are stored as characters.

    -

    > x = "Hi"

    -

    > class(x)

    -

    > y = c("sample", "string")

    -

    > class(y)

    -

    Elements of vectors and matrices must be of the same class.

    -

    > mix = c("aa", -2)

    -

    > mix

    -

    > class(mix)

    -

    > mix[2]

    -

    > class(mix[2])

    -

    When working with data, this will create problems if a column representing a quantitative variable contains any character text: every numeric value in that column is coerced to character.
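    A short sketch of this coercion and of one way to undo it. The use of as.numeric() with suppressWarnings() is an assumption about how you might recover the numbers; entries that do not look like numbers become NA.

```r
mix = c("aa", -2)
mix            # both elements are now character strings
class(mix[2])  # "character"

# Converting back works only for strings that look like numbers;
# the rest become NA (R also issues a warning, suppressed here).
back = suppressWarnings(as.numeric(mix))
back           # NA -2
```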

    -
    -
    -

    4.7 How to Mix Variables of Different Classes

    -

    Matrices are not well-suited for storing data sets. Data sets frequently contain different types of variables (quantitative, qualitative). Matrices force all elements to be of the same class. A data.frame is particularly adept at handling data of different classes.

    -

    > num = c(2,9,6,5)

    -

    > char = LETTERS[c(24,24:26)]

    -

    > dat = data.frame(num, char, stringsAsFactors=FALSE)

    -

    > dat

    -

    > class(dat)

    -

    Though data analysts will rarely spend their time investigating a data set as small as this one, exploring small data sets can be helpful in learning R’s capabilities. In the following code, we investigate the names and dimensions of the data set dat; we also investigate the properties of the columns of dat.

    -

    > names(dat)

    -

    > dim(dat)

    -

    > nrow(dat)

    -

    > ncol(dat)

    -

    > class(dat[,1])

    -

    > class(dat[,2])

    -

    R stores the first column as numeric and the second column as a character. summary gives a numerical summary of numeric variables and little useful information for character variables.

    -

    > summary(dat)

    -

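    It is likely that you want to store a categorical variable as a factor rather than a character vector. Before R 4.0.0, the default behavior of data.frame was to do this conversion; in R 4.0.0 and later, request it with stringsAsFactors=TRUE.

    -

    > dat = data.frame(num, fac=char, stringsAsFactors=TRUE)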

    -

    Now R stores the first column as numeric and the second column as a factor. summary gives a numerical summary of numeric variables and a table for categorical variables.

    -

    > class(dat[,2])

    -

    > summary(dat)

    -

    Keeping track of column numbers can be tedious. It is often more convenient and cleaner to index by the column name. Name indexing uses the dollar sign ($) or double square brackets ([[]]).

    -

    > dat$num # Or dat[["num"]]

    -

    > dat$fac # Or dat[["fac"]]

    -

    Factors can be created explicitly (not just as a side effect of the data.frame function)

    -

    > fac = factor(char)

    -

    The levels function returns the levels of a factor.

    -

    > levels(fac)
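    A brief sketch of what a factor holds, using the same char vector as above. The custom level order in fac2 is a hypothetical addition for illustration; it controls the order in which levels appear in tables and plots.

```r
char = LETTERS[c(24, 24:26)]   # "X" "X" "Y" "Z"
fac  = factor(char)

levels(fac)        # the distinct categories, sorted
table(fac)         # counts per level
as.integer(fac)    # the integer codes R stores internally

# Assumption for illustration: a custom level order.
fac2 = factor(char, levels = c("Z", "Y", "X"))
levels(fac2)
```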

    - - -
    - -
    - - -
    - - - - - \ No newline at end of file diff --git a/docs/05_data_for_analysis.html b/docs/05_data_for_analysis.html index e134d39..25f28f7 100644 --- a/docs/05_data_for_analysis.html +++ b/docs/05_data_for_analysis.html @@ -2,12 +2,12 @@ - + -Introduction to R - 5  Data for Analysis +Introduction to R - 3  Data for Analysis - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    -
    - -
    - -
    - - -
    - - - -
    - -
    -
    -

    11  Creating Functions

    -
    - - - -
    - - - - -
    - - - -
    - - -

    Another very nice feature of R is the ability to easily write your own programs and functions. We will begin by creating a new function called add.machine that will simply sum two numbers:

    -

    add.machine = function(num1, num2){
    -result = num1 + num2
    -return(result)
    -}

    -

    The following are important components of the code above:

    -
      -
    • add.machine is the name of the newly created object

    • -
    • function declares that add.machine will be a function

    • -
    • num1 and num2 are the arguments that add.machine will take as input

    • -
    • The body of the function is enclosed in curly braces

    • -
    • return is optional, but it makes explicit which value the function returns; without it, the function returns the value of the last expression evaluated

    • -
    -

    Let’s test out the newly created function

    -

    > add.machine

    -

    > add.machine(3,5)

    -

    What happens if we don’t specify valid arguments?

    -

    > add.machine(3)

    -

    > add.machine(3, "Hi")
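    Both calls above stop with an error. The sketch below captures the two messages with tryCatch() and then shows a hypothetical variant, add.machine2, that checks its inputs up front and gives num2 a default value. The variant is an assumption added for illustration and is not part of the original function.

```r
add.machine = function(num1, num2){
  result = num1 + num2
  return(result)
}

# Capture each error message instead of stopping.
missing_arg = tryCatch(add.machine(3), error = function(e) conditionMessage(e))
wrong_type  = tryCatch(add.machine(3, "Hi"), error = function(e) conditionMessage(e))
missing_arg
wrong_type

# Hypothetical variant: validate the inputs and give num2 a default.
add.machine2 = function(num1, num2 = 0){
  stopifnot(is.numeric(num1), is.numeric(num2))
  num1 + num2   # the last evaluated expression is returned
}
add.machine2(3, 5)
add.machine2(3)
```

    Checking inputs at the top of a function tends to produce errors that point at the real problem (a bad argument) rather than at the arithmetic inside the body.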

    - - - -
    - - -
    - - - - - \ No newline at end of file diff --git a/docs/12_programming.html b/docs/12_programming.html index 8eb69b0..80980dd 100644 --- a/docs/12_programming.html +++ b/docs/12_programming.html @@ -2,12 +2,12 @@ - + -Introduction to R - 12  Programming +Introduction to R - 9  Programming