
Manipulating and Displaying Big(ish) Data in R

Fred Wright

Date & Time: Pre-recorded (previously delivered in the Texas A&M Superfund Big Data Series 2021)

Instructors: Fred Wright, Burcu Beykal, Allison Dickey


Session Content:

This session provides a tutorial on commonly used features of R, using the RStudio interface. The example datasets are relevant to bench scientists and environmental researchers. No prior familiarity with R is assumed.

Burcu Beykal
  • Introduction to R
    • An introduction to RStudio and installation of packages
    • Reading data into R in various formats
    • Exploring data types and dimensions
    • Extracting data and identifying missing data
    • Sorting data and using the apply function
    • Merging data frames
  • Data visualization and analysis
    • Plotting/graphics in base R (scatterplots, histograms, boxplots, etc.)
    • Basic summary statistics
    • Basic inferential statistics (e.g. t-tests, ANOVA, multiple test correction)
    • Clustering and dimensionality reduction (e.g. PCA)
Allison Dickey
  • More Advanced Visualization
    • Using ggplot2
    • Customizing plots
    • Spatial displays and maps in ggplot2
    • Interactive plots using plotly
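
To give a flavor of the "Introduction to R" topics listed above, here is a minimal base-R sketch. It is not taken from the session materials: the data frames `samples` and `sites`, their column names, and all values are hypothetical and exist only for illustration.

```r
# Hypothetical toy data (not from the session's supporting files)
samples <- data.frame(
  sample_id = c("S1", "S2", "S3", "S4"),
  site      = c("A", "B", "A", "B"),
  lead_ppm  = c(12.5, NA, 8.1, 15.3),
  zinc_ppm  = c(40.2, 35.7, NA, 50.1)
)
sites <- data.frame(
  site   = c("A", "B"),
  region = c("North", "South")
)

# Exploring data types and dimensions
str(samples)
dim(samples)

# Identifying missing data: number of NA values in each column
missing_per_column <- colSums(is.na(samples))

# Using apply: mean of each measurement column, ignoring NA values
measure_cols <- c("lead_ppm", "zinc_ppm")
column_means <- apply(samples[, measure_cols], 2, mean, na.rm = TRUE)

# Sorting rows by lead concentration, highest first (NA rows go last)
sorted_samples <- samples[order(samples$lead_ppm, decreasing = TRUE), ]

# Merging data frames on the shared "site" column
merged <- merge(samples, sites, by = "site")
print(merged)
```

Only functions from base R are used here, so the sketch should run in a fresh R or RStudio session without installing any packages.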

Session Recording:


Download Slide Deck (PDF) | Download Supporting Files (ZIP) (right-click and save file)
