add data info

rhodyrstats · Jan 28, 2020 · 9462262
1 parent c3010d2
commit 9462262

Showing 2 changed files with 63 additions and 11 deletions.
diff --git a/Oz_fires_DC_remix.Rmd b/Oz_fires_DC_remix.Rmd
@@ -108,6 +108,7 @@ rainfall
 You can also view the data in a separate window by clicking on its name in the Global Environment window.
 Here you can also see some information about the data. Note how much data we have!
 Click the arrow next to `rainfall` to view the columns and type of data in each column.
+For more information about this dataset (i.e.\ the metadata) see [https://github.com/rfordatascience/tidytuesday/blob/master/data/2020/2020-01-07/readme.md](https://github.com/rfordatascience/tidytuesday/blob/master/data/2020/2020-01-07/readme.md)
 
 Note that we've read our data from a website, but you can read local files as well.
 These files are in csv format, which means plain text where columns are separated by commas.
@@ -467,6 +468,19 @@ temperature <- temperature %>%
   mutate(year = year(date), month = month(date))
 ```
 
+Challenge: Examine how the temperature has increased over time in Canberra. Filter for data from Canberra. Get the average max temperature in each month for each year. Spread year to be viewable.
+```{r}
+monthly_temps_CAN <- temperature %>%
+  filter(city_name == "CANBERRA") %>%
+  filter(temp_type == "max") %>%
+  group_by(month, year) %>%
+  summarize(avg_temp = mean(temperature, na.rm = T)) %>%
+  spread(year, avg_temp)
+head(monthly_temps_CAN)
+
+```
+
+
 Now we can summarize by month and plot.
 
 ```{r}

diff --git a/Oz_fires_DC_remix.html b/Oz_fires_DC_remix.html
@@ -1,11 +1,10 @@
 <!DOCTYPE html>
 
-<html xmlns="http://www.w3.org/1999/xhtml">
+<html>
 
 <head>
 
 <meta charset="utf-8" />
-<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
 <meta name="generator" content="pandoc" />
 <meta http-equiv="X-UA-Compatible" content="IE=EDGE" />
 
@@ -331,6 +330,7 @@
   border: none;
   display: inline-block;
   border-radius: 4px;
+  background-color: transparent;
 }
 
 .tabset-dropdown > .nav-tabs.nav-tabs-open > li {
@@ -374,14 +374,14 @@ <h2>Setup</h2>
 <p>Note that this lesson is adapted from the <a href="https://datacarpentry.org/R-ecology-lesson/index.html">Data Carpentry Ecology Lesson</a>. You can find a lot more information there. Text that has been copied from this lesson is highlighted in blue.</p>
 <pre class="r"><code>#install.packages(&quot;tidyverse&quot;)
 library(tidyverse)</code></pre>
-<pre><code>## ── Attaching packages ────────────────────────────────────────────────── tidyverse 1.3.0 ──</code></pre>
-<pre><code>## ✔ ggplot2 3.2.1     ✔ purrr   0.3.3
-## ✔ tibble  2.1.3     ✔ dplyr   0.8.3
-## ✔ tidyr   1.0.0     ✔ stringr 1.4.0
-## ✔ readr   1.3.1     ✔ forcats 0.4.0</code></pre>
-<pre><code>## ── Conflicts ───────────────────────────────────────────────────── tidyverse_conflicts() ──
-## ✖ dplyr::filter() masks stats::filter()
-## ✖ dplyr::lag()    masks stats::lag()</code></pre>
+<pre><code>## ── Attaching packages ─────────────────────────────────────────────────── tidyverse 1.3.0 ──</code></pre>
+<pre><code>## ✓ ggplot2 3.2.1     ✓ purrr   0.3.3
+## ✓ tibble  2.1.3     ✓ dplyr   0.8.3
+## ✓ tidyr   1.0.0     ✓ stringr 1.4.0
+## ✓ readr   1.3.1     ✓ forcats 0.4.0</code></pre>
+<pre><code>## ── Conflicts ────────────────────────────────────────────────────── tidyverse_conflicts() ──
+## x dplyr::filter() masks stats::filter()
+## x dplyr::lag()    masks stats::lag()</code></pre>
 </div>
 <div id="get-the-data" class="section level2">
 <h2>Get the Data</h2>
@@ -470,7 +470,7 @@ <h2>Get the Data</h2>
 ##  9 009151       Perth      1967 01    09          NA     NA &lt;NA&gt;    -32.0  116.
 ## 10 009151       Perth      1967 01    10          NA     NA &lt;NA&gt;    -32.0  116.
 ## # … with 179,263 more rows, and 1 more variable: station_name &lt;chr&gt;</code></pre>
-<p>You can also view the data in a separate window by clicking on its name in the Global Environment window. Here you can also see some information about the data. Note how much data we have! Click the arrow next to <code>rainfall</code> to view the columns and type of data in each column.</p>
+<p>You can also view the data in a separate window by clicking on its name in the Global Environment window. Here you can also see some information about the data. Note how much data we have! Click the arrow next to <code>rainfall</code> to view the columns and type of data in each column. For more information about this dataset (i.e. the metadata) see <a href="https://github.com/rfordatascience/tidytuesday/blob/master/data/2020/2020-01-07/readme.md">https://github.com/rfordatascience/tidytuesday/blob/master/data/2020/2020-01-07/readme.md</a></p>
 <p>Note that we’ve read our data from a website, but you can read local files as well. These files are in csv format, which means plain text where columns are separated by commas. This is a very simple format that avoids all the complexity of Excel. If you need to read an Excel file you can either export as a csv or use a different function to read the data.</p>
 <p>For more information on data frames see the <a href="https://datacarpentry.org/R-ecology-lesson/02-starting-with-data.html#what_are_data_frames">Starting with Data</a> section of the Data Carpentry lesson.</p>
 </div>
@@ -666,6 +666,44 @@ <h2>More data manipulation</h2>
 <p>First let’s extract the month from the date information. This time we’ll do it a bit differently</p>
 <pre class="r"><code>temperature &lt;- temperature %&gt;%
   mutate(year = year(date), month = month(date))</code></pre>
+<p>Challenge: Examine how the temperature has increased over time in Canberra. Filter for data from Canberra. Get the average max temperature in each month for each year. Spread year to be viewable.</p>
+<pre class="r"><code>monthly_temps_CAN &lt;- temperature %&gt;%
+  filter(city_name == &quot;CANBERRA&quot;) %&gt;%
+  filter(temp_type == &quot;max&quot;) %&gt;%
+  group_by(month, year) %&gt;%
+  summarize(avg_temp = mean(temperature, na.rm = T)) %&gt;%
+  spread(year, avg_temp)
+head(monthly_temps_CAN)</code></pre>
+<pre><code>## # A tibble: 6 x 108
+## # Groups:   month [6]
+##   month `1913` `1914` `1915` `1916` `1917` `1918` `1919` `1920` `1921` `1922`
+##   &lt;dbl&gt;  &lt;dbl&gt;  &lt;dbl&gt;  &lt;dbl&gt;  &lt;dbl&gt;  &lt;dbl&gt;  &lt;dbl&gt;  &lt;dbl&gt;  &lt;dbl&gt;  &lt;dbl&gt;  &lt;dbl&gt;
+## 1     1   NA     29.1   27.1   28.8   27.0   23.9   30.1   23.3   25.4    NaN
+## 2     2   NA     29.7   30.0   26.9   22.8   24.6   29.4   27.3   25.8    NaN
+## 3     3   NA     25.6   26.2   23.7   23.3   23.4   24.2   22.4   21.8    NaN
+## 4     4   18.2   19.7   19.9   18.2   18.1   18.7   22.2   18.9   19.1    NaN
+## 5     5   13.4   15.3   13.9   15.8   12.9   15.7   16.1   15.5   15.4    NaN
+## 6     6   10.4   13.0   11.5   10.2   11.5   11.2   13.0   11.2   12.8    NaN
+## # … with 97 more variables: `1923` &lt;dbl&gt;, `1924` &lt;dbl&gt;, `1925` &lt;dbl&gt;,
+## #   `1926` &lt;dbl&gt;, `1927` &lt;dbl&gt;, `1928` &lt;dbl&gt;, `1929` &lt;dbl&gt;, `1930` &lt;dbl&gt;,
+## #   `1931` &lt;dbl&gt;, `1932` &lt;dbl&gt;, `1933` &lt;dbl&gt;, `1934` &lt;dbl&gt;, `1935` &lt;dbl&gt;,
+## #   `1936` &lt;dbl&gt;, `1937` &lt;dbl&gt;, `1938` &lt;dbl&gt;, `1939` &lt;dbl&gt;, `1940` &lt;dbl&gt;,
+## #   `1941` &lt;dbl&gt;, `1942` &lt;dbl&gt;, `1943` &lt;dbl&gt;, `1944` &lt;dbl&gt;, `1945` &lt;dbl&gt;,
+## #   `1946` &lt;dbl&gt;, `1947` &lt;dbl&gt;, `1948` &lt;dbl&gt;, `1949` &lt;dbl&gt;, `1950` &lt;dbl&gt;,
+## #   `1951` &lt;dbl&gt;, `1952` &lt;dbl&gt;, `1953` &lt;dbl&gt;, `1954` &lt;dbl&gt;, `1955` &lt;dbl&gt;,
+## #   `1956` &lt;dbl&gt;, `1957` &lt;dbl&gt;, `1958` &lt;dbl&gt;, `1959` &lt;dbl&gt;, `1960` &lt;dbl&gt;,
+## #   `1961` &lt;dbl&gt;, `1962` &lt;dbl&gt;, `1963` &lt;dbl&gt;, `1964` &lt;dbl&gt;, `1965` &lt;dbl&gt;,
+## #   `1966` &lt;dbl&gt;, `1967` &lt;dbl&gt;, `1968` &lt;dbl&gt;, `1969` &lt;dbl&gt;, `1970` &lt;dbl&gt;,
+## #   `1971` &lt;dbl&gt;, `1972` &lt;dbl&gt;, `1973` &lt;dbl&gt;, `1974` &lt;dbl&gt;, `1975` &lt;dbl&gt;,
+## #   `1976` &lt;dbl&gt;, `1977` &lt;dbl&gt;, `1978` &lt;dbl&gt;, `1979` &lt;dbl&gt;, `1980` &lt;dbl&gt;,
+## #   `1981` &lt;dbl&gt;, `1982` &lt;dbl&gt;, `1983` &lt;dbl&gt;, `1984` &lt;dbl&gt;, `1985` &lt;dbl&gt;,
+## #   `1986` &lt;dbl&gt;, `1987` &lt;dbl&gt;, `1988` &lt;dbl&gt;, `1989` &lt;dbl&gt;, `1990` &lt;dbl&gt;,
+## #   `1991` &lt;dbl&gt;, `1992` &lt;dbl&gt;, `1993` &lt;dbl&gt;, `1994` &lt;dbl&gt;, `1995` &lt;dbl&gt;,
+## #   `1996` &lt;dbl&gt;, `1997` &lt;dbl&gt;, `1998` &lt;dbl&gt;, `1999` &lt;dbl&gt;, `2000` &lt;dbl&gt;,
+## #   `2001` &lt;dbl&gt;, `2002` &lt;dbl&gt;, `2003` &lt;dbl&gt;, `2004` &lt;dbl&gt;, `2005` &lt;dbl&gt;,
+## #   `2006` &lt;dbl&gt;, `2007` &lt;dbl&gt;, `2008` &lt;dbl&gt;, `2009` &lt;dbl&gt;, `2010` &lt;dbl&gt;,
+## #   `2011` &lt;dbl&gt;, `2012` &lt;dbl&gt;, `2013` &lt;dbl&gt;, `2014` &lt;dbl&gt;, `2015` &lt;dbl&gt;,
+## #   `2016` &lt;dbl&gt;, `2017` &lt;dbl&gt;, `2018` &lt;dbl&gt;, `2019` &lt;dbl&gt;</code></pre>
 <p>Now we can summarize by month and plot.</p>
 <pre class="r"><code>monthly_temps &lt;- temperature %&gt;%
   group_by(month, year) %&gt;%