
Mining Chemical and Biological Data using High Dimensional Techniques in R/Python

The workshop will be held via Zoom on March 22nd, 2-4 pm Eastern Time. Enrollment is limited to 15 students to allow for effective hands-on training. Registration opens on February 18th.

Click here to register!


This two-hour online training workshop is an application-focused tutorial on coding in R/Python, including hands-on activities to analyze chemical and biological data. The workshop is designed to reach audiences outside of the UNC-Chapel Hill main campus and to foster engagement across research institutions. Topics will include:


  • Introduction to intermediate-level coding concepts, including tidyverse (coding via piping to “tidy” data) and ggplot2 (graphics visualization) R packages
  • High dimensional data reduction techniques, including PCA
  • Data clustering techniques, including k-means
  • High dimensional data visualizations, spanning simple scatter plots to more advanced PHATE visualizations
  • Real-world applications of these tools using a wildfire smoke chemical exposure dataset and a single-cell sequencing dataset to evaluate human exposures and underlying biology
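
As a rough illustration of how the dimension reduction and clustering topics above fit together, the following Python sketch runs PCA (via singular value decomposition) and then k-means on the reduced data. It is not taken from the workshop materials: the synthetic two-group dataset, the helper names `pca` and `kmeans`, and all parameter values are assumptions made for illustration only.

```python
import numpy as np


def pca(X, n_components=2):
    """Project the rows of X onto the top principal components using SVD."""
    Xc = X - X.mean(axis=0)  # center each feature
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T  # sample coordinates in PC space
    explained = (S ** 2) / np.sum(S ** 2)  # fraction of variance per component
    return scores, explained[:n_components]


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm: alternate assignment and center updates."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every sample to every center, shape (n_samples, k).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Synthetic stand-in data (an assumption): two groups of 50 samples,
# each with 20 features, differing in their mean.
rng = np.random.default_rng(42)
group_a = rng.normal(0.0, 1.0, size=(50, 20))
group_b = rng.normal(5.0, 1.0, size=(50, 20))
X = np.vstack([group_a, group_b])

scores, explained = pca(X, n_components=2)
labels, centers = kmeans(scores, k=2)

print("Variance explained by PC1 and PC2:", explained)
print("Cluster sizes:", np.bincount(labels))
```

Clustering on the PCA scores rather than on the raw features is one common choice, not the only one; with real chemical or biological data, features would typically be scaled before either step.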


This class is open to upper-level undergraduates and graduate students; some experience with coding is recommended. In preparation for the workshop, participants will receive instructions on how to quickly set up an RStudio Cloud working environment for easy, hands-on participation. We will also provide a start-up script and example datasets.