

Date & Time: Pre-recorded (previously delivered in the Texas A&M Superfund Big Data Series 2021)
Instructors:
- Fred Wright | North Carolina State University’s Bioinformatics Research Center | fred_wright@ncsu.edu
- Burcu Beykal | University of Connecticut Department of Chemical & Biomolecular Engineering | burcu.beykal@uconn.edu
- Allison Dickey | North Carolina State University’s Bioinformatics Research Center | andickey@ncsu.edu
MANIPULATING AND DISPLAYING BIG(ISH) DATA IN R
Session Content:
This session is a tutorial on commonly used features of R, taught through the RStudio interface. The example datasets are relevant to bench scientists and environmental researchers. No prior familiarity with R is assumed.

- Introduction to R
  - An introduction to RStudio and installation of packages
  - Reading data into R in various formats
  - Exploring data types and dimensions
  - Extracting data and identifying missing data
  - Sorting data and using the apply function
  - Merging data frames
- Data visualization and analysis
  - Plotting/graphics in base R (scatterplots, histograms, boxplots, etc.)
  - Basic summary statistics
  - Basic inferential statistics (e.g. t-tests, ANOVA, multiple testing correction)
  - Clustering and dimensionality reduction (e.g. PCA)
- More advanced visualization
  - Using ggplot2
  - Customizing plots
  - Spatial displays and maps in ggplot2
  - Interactive plots using plotly
Session Recording:
Download Slide Deck (PDF) | Download Supporting Files (ZIP) (right-click and save file)