Introduction to Biomedical Data Science
This 2-day workshop helps learners move from traditional spreadsheet programs to a programming language (R) for their data analysis pipelines. It assumes no prior programming knowledge. Using real-world data examples, it introduces how to think about spreadsheet data and how to load and view it programmatically. The sessions will review how data scientists work with data and the tools they use to manipulate, view, and analyze it. By the end of the workshop, learners will be able to:
1. Identify when spreadsheets are useful
2. Assess when a task should not be done in spreadsheet software
3. Name the features of a tidy/clean dataset
4. Transform data for analysis
5. Break down data processing into smaller, more manageable individual steps
6. Build a data processing pipeline that can be used in multiple programs
7. Construct a plot and table for exploratory data analysis
8. Calculate, interpret, and communicate an appropriate statistical analysis of the data
This workshop is part of a research study looking to assess the effectiveness of data science education materials. You will be invited to join the study by taking a series of workshop surveys. Your participation in the study will not affect your ability to take the workshop.
The workshop will take place on Monday, May 17 and Tuesday, May 18, from 2:00 pm to 4:30 pm each day. The sessions will be held online; the Zoom link and password will be sent to registrants in the confirmation and reminder emails.
Prerequisites: None, though registration is required
- Monday, May 17, 2021
- 2:00PM - 4:30PM
- Daniel Chen, MPH, Virginia Tech