##Your Data Can Live Forever: How to Promote Data Reuse
####Learning Goals:
This activity is designed to help you understand what someone outside your research project (or you in 5-10 years) would need to know about your data in order to build on your work.
####Target Audience:
Graduate students, early-career researchers, project leads
####Delivery Format:
In-person workshop
####Level:
Beginner
####Time to complete:
45 minutes in person, with additional time to refine after the workshop
####What you'll need to start:
Pen/pencil or text editor, [Data Reuse Plan worksheet](04b_Data_Reuse_Plan_Worksheet.md), a data set with which you are familiar
####Before using this resource you should be familiar with:
Not applicable
####For information on those topics, see:
Not applicable
####Glossary of key terms:
Open data: Data that is made easily and freely available for anyone to access, use, and share without restrictions, the possible exception being a requirement of attribution.
Metadata: Metadata is information that describes, explains, locates, or otherwise makes it easier to find, access, and use an information resource (in this case, data). For example, metadata for a photograph may include the name of the photographer, when and where it was taken, as well as the type of camera and settings used to take the photograph.
Licensing: A license gives explicit permissions for the use of something. This is particularly important if you want to make your data open, because some jurisdictions assign copyright to data sets, which limits their use. Several types of licenses are in common use for data. You can read more about them here: [http://www.dcc.ac.uk/resources/how-guides/license-research-data](http://www.dcc.ac.uk/resources/how-guides/license-research-data)
Naming conventions: These are a set of predefined rules for the naming and structure of folders, files, field names, etc. (e.g. all files begin with a date, location and project name.) Naming conventions help provide context to a data set, as well as make sure a standard of data collection and management is being followed by all members of a team.
Permanent Identifiers: A permanent identifier (or PID) is a set of numbers and/or characters, frequently in the form of a URL, that points to the location of a resource. PIDs are set up so that even if the storage location of the resource changes over time (e.g. moving data from one university server to another), the PID continues to point to the correct location. A DOI (Digital Object Identifier) is a commonly known type of PID.
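
To make the idea of a naming convention concrete, here is a minimal Python sketch. It assumes a hypothetical rule modeled on the example in the glossary (each file name begins with a date, then a location, then a project name); the exact pattern, the separator characters, and the function names `build_filename` and `follows_convention` are illustrative assumptions, not part of this exercise or its worksheet.

```python
import re
from datetime import date

# Hypothetical convention (an assumption for illustration only):
#   YYYY-MM-DD_location_project.ext, lowercase, words joined by hyphens.
PATTERN = re.compile(
    r"^(\d{4}-\d{2}-\d{2})_([a-z0-9-]+)_([a-z0-9-]+)\.([a-z0-9]+)$"
)


def _slug(text: str) -> str:
    """Lowercase the text and replace runs of other characters with '-'."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(collected_on: date, location: str, project: str,
                   extension: str) -> str:
    """Build a file name that follows the hypothetical convention."""
    ext = extension.lower().lstrip(".")
    return f"{collected_on.isoformat()}_{_slug(location)}_{_slug(project)}.{ext}"


def follows_convention(filename: str) -> bool:
    """Return True if the name matches the pattern and the date is real."""
    match = PATTERN.match(filename)
    if match is None:
        return False
    try:
        date.fromisoformat(match.group(1))
    except ValueError:
        # Matches the digit pattern but is not a valid calendar date.
        return False
    return True
```

A team could run a check like `follows_convention` over a data folder before sharing it, so that files which drifted from the agreed rule are caught early. Whatever rule your team chooses, record it in your data reuse plan.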
####Intro to material:
Data reuse saves time and accelerates the pace of scientific discovery. By making your data open and available to others, you make it possible for future researchers to answer questions that haven’t yet been asked. Thinking about data reuse in advance, and documenting your plans, also helps you remember your processes and workflow so that you can defend your research. Think back to second grade, when your teacher told you to “show your work”.

When it comes to making your work reusable, “the devil is in the details”. Upon completion of this exercise, you will have a detailed data reuse plan, which you can save as a README or text file and store with your data so that others can understand and reuse it.

####Steps to complete:

  1. Break into groups of 2-5 people, as evenly sized as possible. (2 mins)
  2. Identify one volunteer to be the “Researcher” and describe their research data for this exercise. This person will need to be fairly familiar with how and why the data was collected. (2 mins)
  3. Identify a note taker to record responses to questions from the group about the data set. (1 min)
  4. Using the Data Reuse Plan worksheet as a guide, members of the group ask the “Researcher” questions about their data set while the note taker records the responses. (The note taker can ask questions, too!) As you ask questions, think about how you would (if you could) respond to a similar question about your own data set. (30-40 mins)
  5. If you have time, upon completion of the worksheet, review your responses and make sure they would be clear to someone viewing your data set for the first time. You are writing this for someone you’ve never met. Avoid jargon and abbreviations where possible.
  6. Review the questions below and be prepared to share your responses with the larger group.

#####Questions to Keep in Mind:

  1. If you don’t have some of the information the worksheet asks for, is there a way you can get it? If not, is there something you could have done differently during your research project to collect that information?
  2. Which parts of the worksheet are particularly challenging? Why? What research best practices could be put into place to make them easier?
  3. Are there pieces of information missing from this worksheet that would help someone understand your data with the intent to reuse it?
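
After the workshop, the recorded responses can be stored alongside the data as a plain-text README, as the introduction suggests. The following is a minimal Python sketch of one way to do that. The function names, the `README.txt` file name, and the `NOT RECORDED` placeholder are assumptions made for illustration, and the question labels you pass in should come from your own completed worksheet rather than from this example.

```python
from pathlib import Path


def render_reuse_plan(title: str, answers: dict[str, str]) -> str:
    """Format worksheet responses as plain text, one question per block."""
    lines = [title, "=" * len(title), ""]
    for question, answer in answers.items():
        lines.append(f"{question}:")
        # Flag blank answers so gaps stay visible to future readers.
        lines.append(f"  {answer.strip() or 'NOT RECORDED'}")
        lines.append("")
    return "\n".join(lines)


def save_reuse_plan(data_dir: str, title: str, answers: dict[str, str]) -> Path:
    """Write the plan to README.txt inside the data folder and return its path."""
    path = Path(data_dir) / "README.txt"
    path.write_text(render_reuse_plan(title, answers), encoding="utf-8")
    return path
```

Marking unanswered questions explicitly, instead of dropping them, ties back to the first question above: it shows which information still needs to be tracked down before someone else can reuse the data.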

####Follow-up resources and materials:

####Credits and attribution:
Metadata definition adapted from “Understanding Metadata”, published by NISO Press (2004): [http://www.niso.org/publications/press/UnderstandingMetadata.pdf](http://www.niso.org/publications/press/UnderstandingMetadata.pdf)