add data info
rachelss committed Jan 28, 2020
1 parent c3010d2 commit 9462262
Showing 2 changed files with 63 additions and 11 deletions.
14 changes: 14 additions & 0 deletions Oz_fires_DC_remix.Rmd
@@ -108,6 +108,7 @@ rainfall
You can also view the data in a separate window by clicking on its name in the Global Environment window.
Here you can also see some information about the data. Note how much data we have!
Click the arrow next to `rainfall` to view the columns and type of data in each column.
For more information about this dataset (i.e.\ the metadata), see [https://github.com/rfordatascience/tidytuesday/blob/master/data/2020/2020-01-07/readme.md](https://github.com/rfordatascience/tidytuesday/blob/master/data/2020/2020-01-07/readme.md).

Note that we've read our data from a website, but you can read local files as well.
These files are in csv format, which means plain text where columns are separated by commas.
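
As a minimal sketch of reading a local file, assuming the lesson's data were loaded with `read_csv()` from readr: the file path, column names, and values below are made up for illustration and are not part of the rainfall dataset.

```r
library(readr)

# Hypothetical local file: write a tiny CSV to a temporary path
# so that this sketch runs on its own.
local_path <- tempfile(fileext = ".csv")
writeLines(
  c("city_name,year,rainfall",
    "Perth,1967,1.2",
    "Canberra,1967,0.4"),
  local_path
)

# read_csv() takes a local path in the same way it takes a URL.
local_rain <- read_csv(local_path)
local_rain
```

In practice you would replace `local_path` with the path to your own csv file.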
@@ -467,6 +468,19 @@ temperature <- temperature %>%
mutate(year = year(date), month = month(date))
```

Challenge: Examine how the temperature has increased over time in Canberra. Filter for data from Canberra, compute the average maximum temperature for each month of each year, then spread the years into columns so the table is easier to read.
```{r}
monthly_temps_CAN <- temperature %>%
filter(city_name == "CANBERRA") %>%
filter(temp_type == "max") %>%
group_by(month, year) %>%
summarize(avg_temp = mean(temperature, na.rm = TRUE)) %>%
spread(year, avg_temp)
head(monthly_temps_CAN)
```
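
The same reshaping step can also be sketched with `pivot_wider()`, which the tidyr documentation recommends over `spread()` for new code. This is only an illustration under stated assumptions: `toy_temperature` is a made-up stand-in that borrows the lesson's column names, and its values are not real observations.

```r
library(dplyr)
library(tidyr)

# Made-up stand-in for the lesson's `temperature` data (illustrative values only).
toy_temperature <- data.frame(
  city_name   = c("CANBERRA", "CANBERRA", "CANBERRA", "CANBERRA", "PERTH"),
  temp_type   = c("max", "max", "max", "min", "max"),
  year        = c(2018, 2018, 2019, 2019, 2019),
  month       = c(1, 1, 1, 1, 1),
  temperature = c(30, NA, 34, 15, 31),
  stringsAsFactors = FALSE
)

monthly_toy <- toy_temperature %>%
  filter(city_name == "CANBERRA", temp_type == "max") %>%
  group_by(month, year) %>%
  summarize(avg_temp = mean(temperature, na.rm = TRUE)) %>%
  ungroup() %>%
  # One column per year, one row per month.
  pivot_wider(names_from = year, values_from = avg_temp)

monthly_toy
```

The `ungroup()` call drops the grouping left over from `summarize()`, so the widened table is a plain tibble.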


Now we can summarize by month and plot.

```{r}
60 changes: 49 additions & 11 deletions Oz_fires_DC_remix.html
@@ -1,11 +1,10 @@
<!DOCTYPE html>

<html xmlns="http://www.w3.org/1999/xhtml">
<html>

<head>

<meta charset="utf-8" />
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="pandoc" />
<meta http-equiv="X-UA-Compatible" content="IE=EDGE" />

@@ -331,6 +330,7 @@
border: none;
display: inline-block;
border-radius: 4px;
background-color: transparent;
}

.tabset-dropdown > .nav-tabs.nav-tabs-open > li {
@@ -374,14 +374,14 @@ <h2>Setup</h2>
<p>Note that this lesson is adapted from the <a href="https://datacarpentry.org/R-ecology-lesson/index.html">Data Carpentry Ecology Lesson</a>. You can find a lot more information there. Text that has been copied from this lesson is highlighted in blue.</p>
<pre class="r"><code>#install.packages(&quot;tidyverse&quot;)
library(tidyverse)</code></pre>
<pre><code>## ── Attaching packages ────────────────────────────────────────────────── tidyverse 1.3.0 ──</code></pre>
<pre><code>## ggplot2 3.2.1 purrr 0.3.3
## tibble 2.1.3 dplyr 0.8.3
## tidyr 1.0.0 stringr 1.4.0
## readr 1.3.1 forcats 0.4.0</code></pre>
<pre><code>## ── Conflicts ───────────────────────────────────────────────────── tidyverse_conflicts() ──
## dplyr::filter() masks stats::filter()
## dplyr::lag() masks stats::lag()</code></pre>
<pre><code>## ── Attaching packages ────────────────────────────────────────────────── tidyverse 1.3.0 ──</code></pre>
<pre><code>## ggplot2 3.2.1 purrr 0.3.3
## tibble 2.1.3 dplyr 0.8.3
## tidyr 1.0.0 stringr 1.4.0
## readr 1.3.1 forcats 0.4.0</code></pre>
<pre><code>## ── Conflicts ───────────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()</code></pre>
</div>
<div id="get-the-data" class="section level2">
<h2>Get the Data</h2>
@@ -470,7 +470,7 @@ <h2>Get the Data</h2>
## 9 009151 Perth 1967 01 09 NA NA &lt;NA&gt; -32.0 116.
## 10 009151 Perth 1967 01 10 NA NA &lt;NA&gt; -32.0 116.
## # … with 179,263 more rows, and 1 more variable: station_name &lt;chr&gt;</code></pre>
<p>You can also view the data in a separate window by clicking on its name in the Global Environment window. Here you can also see some information about the data. Note how much data we have! Click the arrow next to <code>rainfall</code> to view the columns and type of data in each column.</p>
<p>You can also view the data in a separate window by clicking on its name in the Global Environment window. Here you can also see some information about the data. Note how much data we have! Click the arrow next to <code>rainfall</code> to view the columns and type of data in each column. For more information about this dataset (i.e. the metadata), see <a href="https://github.com/rfordatascience/tidytuesday/blob/master/data/2020/2020-01-07/readme.md">https://github.com/rfordatascience/tidytuesday/blob/master/data/2020/2020-01-07/readme.md</a>.</p>
<p>Note that we’ve read our data from a website, but you can read local files as well. These files are in csv format, which means plain text where columns are separated by commas. This is a very simple format that avoids all the complexity of Excel. If you need to read an Excel file you can either export as a csv or use a different function to read the data.</p>
<p>For more information on data frames see the <a href="https://datacarpentry.org/R-ecology-lesson/02-starting-with-data.html#what_are_data_frames">Starting with Data</a> section of the Data Carpentry lesson.</p>
</div>
@@ -666,6 +666,44 @@ <h2>More data manipulation</h2>
<p>First let’s extract the month from the date information. This time we’ll do it a bit differently.</p>
<pre class="r"><code>temperature &lt;- temperature %&gt;%
mutate(year = year(date), month = month(date))</code></pre>
<p>Challenge: Examine how the temperature has increased over time in Canberra. Filter for data from Canberra, compute the average maximum temperature for each month of each year, then spread the years into columns so the table is easier to read.</p>
<pre class="r"><code>monthly_temps_CAN &lt;- temperature %&gt;%
filter(city_name == &quot;CANBERRA&quot;) %&gt;%
filter(temp_type == &quot;max&quot;) %&gt;%
group_by(month, year) %&gt;%
summarize(avg_temp = mean(temperature, na.rm = TRUE)) %&gt;%
spread(year, avg_temp)
head(monthly_temps_CAN)</code></pre>
<pre><code>## # A tibble: 6 x 108
## # Groups: month [6]
## month `1913` `1914` `1915` `1916` `1917` `1918` `1919` `1920` `1921` `1922`
## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;
## 1 1 NA 29.1 27.1 28.8 27.0 23.9 30.1 23.3 25.4 NaN
## 2 2 NA 29.7 30.0 26.9 22.8 24.6 29.4 27.3 25.8 NaN
## 3 3 NA 25.6 26.2 23.7 23.3 23.4 24.2 22.4 21.8 NaN
## 4 4 18.2 19.7 19.9 18.2 18.1 18.7 22.2 18.9 19.1 NaN
## 5 5 13.4 15.3 13.9 15.8 12.9 15.7 16.1 15.5 15.4 NaN
## 6 6 10.4 13.0 11.5 10.2 11.5 11.2 13.0 11.2 12.8 NaN
## # … with 97 more variables: `1923` &lt;dbl&gt;, `1924` &lt;dbl&gt;, `1925` &lt;dbl&gt;,
## # `1926` &lt;dbl&gt;, `1927` &lt;dbl&gt;, `1928` &lt;dbl&gt;, `1929` &lt;dbl&gt;, `1930` &lt;dbl&gt;,
## # `1931` &lt;dbl&gt;, `1932` &lt;dbl&gt;, `1933` &lt;dbl&gt;, `1934` &lt;dbl&gt;, `1935` &lt;dbl&gt;,
## # `1936` &lt;dbl&gt;, `1937` &lt;dbl&gt;, `1938` &lt;dbl&gt;, `1939` &lt;dbl&gt;, `1940` &lt;dbl&gt;,
## # `1941` &lt;dbl&gt;, `1942` &lt;dbl&gt;, `1943` &lt;dbl&gt;, `1944` &lt;dbl&gt;, `1945` &lt;dbl&gt;,
## # `1946` &lt;dbl&gt;, `1947` &lt;dbl&gt;, `1948` &lt;dbl&gt;, `1949` &lt;dbl&gt;, `1950` &lt;dbl&gt;,
## # `1951` &lt;dbl&gt;, `1952` &lt;dbl&gt;, `1953` &lt;dbl&gt;, `1954` &lt;dbl&gt;, `1955` &lt;dbl&gt;,
## # `1956` &lt;dbl&gt;, `1957` &lt;dbl&gt;, `1958` &lt;dbl&gt;, `1959` &lt;dbl&gt;, `1960` &lt;dbl&gt;,
## # `1961` &lt;dbl&gt;, `1962` &lt;dbl&gt;, `1963` &lt;dbl&gt;, `1964` &lt;dbl&gt;, `1965` &lt;dbl&gt;,
## # `1966` &lt;dbl&gt;, `1967` &lt;dbl&gt;, `1968` &lt;dbl&gt;, `1969` &lt;dbl&gt;, `1970` &lt;dbl&gt;,
## # `1971` &lt;dbl&gt;, `1972` &lt;dbl&gt;, `1973` &lt;dbl&gt;, `1974` &lt;dbl&gt;, `1975` &lt;dbl&gt;,
## # `1976` &lt;dbl&gt;, `1977` &lt;dbl&gt;, `1978` &lt;dbl&gt;, `1979` &lt;dbl&gt;, `1980` &lt;dbl&gt;,
## # `1981` &lt;dbl&gt;, `1982` &lt;dbl&gt;, `1983` &lt;dbl&gt;, `1984` &lt;dbl&gt;, `1985` &lt;dbl&gt;,
## # `1986` &lt;dbl&gt;, `1987` &lt;dbl&gt;, `1988` &lt;dbl&gt;, `1989` &lt;dbl&gt;, `1990` &lt;dbl&gt;,
## # `1991` &lt;dbl&gt;, `1992` &lt;dbl&gt;, `1993` &lt;dbl&gt;, `1994` &lt;dbl&gt;, `1995` &lt;dbl&gt;,
## # `1996` &lt;dbl&gt;, `1997` &lt;dbl&gt;, `1998` &lt;dbl&gt;, `1999` &lt;dbl&gt;, `2000` &lt;dbl&gt;,
## # `2001` &lt;dbl&gt;, `2002` &lt;dbl&gt;, `2003` &lt;dbl&gt;, `2004` &lt;dbl&gt;, `2005` &lt;dbl&gt;,
## # `2006` &lt;dbl&gt;, `2007` &lt;dbl&gt;, `2008` &lt;dbl&gt;, `2009` &lt;dbl&gt;, `2010` &lt;dbl&gt;,
## # `2011` &lt;dbl&gt;, `2012` &lt;dbl&gt;, `2013` &lt;dbl&gt;, `2014` &lt;dbl&gt;, `2015` &lt;dbl&gt;,
## # `2016` &lt;dbl&gt;, `2017` &lt;dbl&gt;, `2018` &lt;dbl&gt;, `2019` &lt;dbl&gt;</code></pre>
<p>Now we can summarize by month and plot.</p>
<pre class="r"><code>monthly_temps &lt;- temperature %&gt;%
group_by(month, year) %&gt;%
