Data Science for Biology Workshop Series

The third workshop in the series is scheduled for February 24-28, 2025.

Who is it for?

For researchers studying health and disease who work with raw sequencing data and want to build a foundation of bioinformatics skills. No command-line experience is necessary, but experience with data analysis in R would be a huge plus.

Content covered:

Understanding the outputs of sequencing experiments.

Mapping viral sequences to reference databases.

Building and visualizing trees of viruses.

To apply for the third workshop, fill out this form

Thank you for participating in Workshop II!

Fill out the post-workshop survey here

Before you leave, please answer a few questions to help us understand whether the workshop was valuable and how we can improve it.


Retrieving and running your workshop files

The instances will be going down, but your work has been saved!

  1. Navigate to this link to download the contents of your workshop folder from your IP Workshop II instance.
  2. Follow the detailed instructions here to install R and RStudio on your computer. Ensure you are on R version 4.2 or higher.
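  3. Run the following commands in the R console to install the necessary packages:
# CRAN packages (microViz may not be on CRAN; if it fails to install, follow its own installation instructions)
install.packages(c("here", "tidyverse", "rstatix", "microViz", "ggpubr", "ggbeeswarm", "ggside", "vegan", "reshape2", "pander", "gplots", "RColorBrewer", "GUniFrac"))
# dada2, phyloseq, and Heatplus are distributed through Bioconductor, so install them with BiocManager
if (!require("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install(c("dada2", "phyloseq", "Heatplus"), ask = TRUE)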
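Once the install step has finished, a quick check like the one below can confirm that your setup matches the requirements listed above. This is a minimal sketch, not part of the official workshop materials: the package names are copied from the install command (with microViz assumed to be the intended spelling), and the variable names are illustrative.

```r
# Packages named in the install step above
required <- c("dada2", "here", "tidyverse", "phyloseq", "rstatix", "microViz",
              "ggpubr", "ggbeeswarm", "ggside", "vegan", "reshape2", "pander",
              "gplots", "RColorBrewer", "GUniFrac", "Heatplus")

# TRUE when the running R meets the workshop minimum (version 4.2)
r_version_ok <- getRversion() >= "4.2.0"

# Required packages that are not yet installed on this computer
installed <- rownames(installed.packages())
missing_pkgs <- setdiff(required, installed)

if (!r_version_ok) {
  message("R ", as.character(getRversion()), " is older than 4.2; please update R.")
}
if (length(missing_pkgs) > 0) {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All workshop packages are installed.")
}
```

If any packages are reported as missing, rerun the install command for just those names and read the console output for the reason.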

Group Dataset Example Workflows

Asavela’s Demo
Liz’s Demo
Note: The file paths to the datasets in these demos may not match the paths on your computer!
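If a demo fails because a dataset cannot be found, one way to adapt it is to define the data location once and build every path from it. The sketch below uses only base R; the folder name `workshop_files/data` and the file name `metadata.csv` are hypothetical placeholders, so replace them with wherever you unpacked your download.

```r
# Hypothetical location of the downloaded workshop folder; edit to match your computer
data_dir <- file.path("workshop_files", "data")

# Hypothetical file name, used only to illustrate building a path
input_file <- file.path(data_dir, "metadata.csv")

if (file.exists(input_file)) {
  metadata <- read.csv(input_file)
} else {
  # Show the full path R tried, which makes a mismatch easier to spot
  message("Not found: ", normalizePath(input_file, mustWork = FALSE),
          "\nUpdate data_dir to match your computer.")
}
```

Keeping the location in a single variable means only one line needs to change when a demo is moved to a different machine.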

Workshop II Discord
Join the Discord here
Once you’ve joined, you can bookmark https://discord.com/channels/1158136582201692202/ to go straight to the channel

Schedule
| Date/Time | Topic | Instructor |
|---|---|---|
| Monday, April 22nd | | |
| AM | Workshop Introduction | Scott Handley (Wash U School of Medicine) |
| | Keynote | Sinaye Ngcapu (CAPRISA) |
| | Sequencing and Library Preparation | Lindsay Droit (Wash U School of Medicine) |
| PM | R Refresher (tidyverse/summary tables/data visualization) | Scott Handley (Wash U School of Medicine) |
| Tuesday, April 23rd | | |
| AM | 16S Data Preprocessing | Asavela Kama (CAPRISA) |
| PM | Exploring High Dimensional Data | Elizabeth Costello (Stanford University) |
| Wednesday, April 24th | | |
| AM | Exploring High Dimensional Data | Elizabeth Costello (Stanford University) |
| | Correlations and Differential Abundance Testing | Joseph Elsherbini (Ragon Institute) |
| PM | Group Activity I | Johnathan Shih (Ragon Institute) |
| Thursday, April 25th | | |
| AM | VIRGO / Shotgun Metagenomics | Michael France (UMD School of Medicine) |
| PM | Group Activity II | Johnathan Shih (Ragon Institute) |
| Evening | Reception | |
| Friday, April 26th | | |
| AM | Keynote | Jacques Ravel (UMD School of Medicine) |
| | Group Activity review | |
| PM | Group Presentations | Participants |
| | Dataset Demo | |
| | Workshop Outro | Scott Handley (Wash U School of Medicine) |