Homework 3 clarification #36
Comments
Hi @sandraemry, that is a good question! I think both are just fine. Just make sure your reviewer knows where to find the assertions, perhaps by labelling that section in your R script with a large comment.
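To illustrate the "large comment" idea, here is a minimal sketch in base R. The data frame, its column names, and the specific checks are hypothetical placeholders, and base `stopifnot()` is used only as one possible way to write assertions; the course may use a different assertion package.

```r
# Hypothetical tidy data set; the column names are placeholders, not from the thread.
tidy_data <- data.frame(
  site  = c("A", "B", "C"),
  count = c(12L, 0L, 7L),
  stringsAsFactors = FALSE
)

#############################################
## ASSERTIONS: checks on the tidy data set ##
#############################################

# stopifnot() halts the script with an error if any condition is not TRUE.
stopifnot(
  is.character(tidy_data$site),   # site is stored as text
  is.integer(tidy_data$count),    # count is stored as whole numbers
  all(tidy_data$count >= 0),      # counts cannot be negative
  !any(is.na(tidy_data))          # no missing values anywhere
)
```

A banner comment like this makes the section easy to find whether the assertions live in the cleaning script or in a separate script that reads the tidy file.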
Hi @aammd,
Hi @katcheung, you can read in your csv files while specifying the data type of each column. So for me it would look like this: mydata <- read_csv("./data/flowcam_sum_tidy.csv", col_types = cols( Is that what you were asking about? Or maybe @aammd has a better solution?
Hi @sandraemry & @katcheung, I think Sandra has a good answer here! You're right, factors are created when a csv or other file is read into R. So if you change the way you read the file, you change the way the result is represented in R. Sandra's example code shows one way to control exactly how each column is read. Another answer to your question, @katcheung, is that you can choose to work in a clean script (reading in your tidy CSV) or at the bottom of your old one. Just make sure it is clear for your peer reviewer.
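As a rough sketch of how column types can be controlled at read time with readr, the example below writes a small temporary CSV and reads it back. The file contents, column names, and factor levels are made up for illustration and are not the columns of the file mentioned in the thread.

```r
library(readr)

# Write a small hypothetical CSV so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("site,treatment,count",
    "A,control,12",
    "B,warm,7"),
  csv_path
)

# Declare the type of every column explicitly instead of letting R guess.
mydata <- read_csv(
  csv_path,
  col_types = cols(
    site      = col_character(),                            # plain text
    treatment = col_factor(levels = c("control", "warm")),  # factor with known levels
    count     = col_integer()                               # whole numbers
  )
)
```

Declaring the types in one place means a column only becomes a factor when you ask for it, and the expected structure of the file is documented right where it is read.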
@aammd Regarding the metadata, should we make it routine to work only with files that have metadata? As an example, should I save all my data files as csvy? It doesn't seem very useful to have metadata for only one part of your script (you add metadata in 01_rscript, but read in the data as csv in 02_analyse_data)?
@LinneaSandell this is an interesting question, and one we should return to in class! Briefly, I think that we are drawing a distinction here between "in progress" data and the "final version" of the dataset. So we add metadata only when we are "happy" with the way the dataset is organized. However, one could imagine many other workflows, where metadata is created at the beginning, or in the middle, of a project.
Hi @aammd
Do we add assertions to our script that cleans the raw data? Or should I read in my tidy data set and write assertions for that one?
Thanks!
Sandra