This primer introduces the fundamentals of Data Science with minimal prerequisites through a set of topic presentations and accompanying exercises in Python notebooks. It should appeal to students, researchers, and faculty with interests in any area of Data Science who:
- Wish to learn about Data Science hands-on through code and examples, or
- Are familiar with Data Science methods but wish to learn more about computational tools.
Each topic comprises a presentation (Google Slides and PDF) together with Python examples in Jupyter notebooks.
1. Introduction to Data Science
Description: This topic introduces the fundamentals of Data Science and briefly reviews some basic concepts of statistics. It also gives an overview of how to run a successful Data Science project.
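As a small taste of the statistics review, the sketch below computes a few summary statistics with Python's standard `statistics` module. The sample values are made up for illustration and are not taken from the course material.

```python
import statistics

# Hypothetical sample: daily page views for one week (made-up numbers).
samples = [120, 135, 128, 150, 143, 310, 131]

mean = statistics.mean(samples)      # arithmetic mean, pulled up by the outlier 310
median = statistics.median(samples)  # middle value, robust to the outlier
stdev = statistics.stdev(samples)    # sample standard deviation (divides by n - 1)

print(f"mean={mean:.1f} median={median} stdev={stdev:.1f}")
```

Comparing the mean with the median is a quick way to spot skew: here a single unusually large value moves the mean but leaves the median unchanged.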
2. Introduction to Graph Analytics
Description: In addition to a brief introduction to Graph Theory, this topic covers the basics of graph analytics with NetworkX, a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks.
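A minimal sketch of what graph analytics with NetworkX can look like is given below. The toy collaboration network and its node names are hypothetical; only the NetworkX calls themselves (`Graph`, `add_edges_from`, `degree`, `degree_centrality`, `shortest_path`) are real library functions.

```python
import networkx as nx

# Hypothetical toy network: an edge means two people have collaborated.
G = nx.Graph()
G.add_edges_from([
    ("Ana", "Ben"), ("Ana", "Cy"), ("Ana", "Dee"),
    ("Ben", "Cy"), ("Dee", "Eli"),
])

# Number of neighbours per node.
degree = dict(G.degree())

# Degree divided by (n - 1): the fraction of other nodes each node touches.
centrality = nx.degree_centrality(G)

# Fewest-hops route between two nodes.
path = nx.shortest_path(G, "Ben", "Eli")

print(degree)
print(path)
```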
3. Exploratory Data Analysis with pandas and matplotlib
Description: This topic introduces two Python packages, pandas and matplotlib, for Exploratory Data Analysis: an approach to analyzing data sets that summarizes their main characteristics, often with visualization methods.
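The following sketch shows the kind of first-pass exploration this topic describes: summarize a column, compare groups, and draw a histogram. The data frame contents and column names are invented for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical measurements; the column names are illustrative only.
df = pd.DataFrame({
    "species": ["a", "a", "b", "b", "b"],
    "length_cm": [4.1, 3.9, 5.6, 5.9, 6.1],
})

# Count, mean, standard deviation, min, quartiles, and max in one call.
summary = df["length_cm"].describe()

# Mean length per group.
by_species = df.groupby("species")["length_cm"].mean()

# Histogram of the measurements; in a notebook the figure renders inline.
fig, ax = plt.subplots()
df["length_cm"].plot.hist(bins=5, ax=ax)
ax.set_xlabel("length (cm)")

print(summary)
print(by_species)
```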
4. Introduction to Machine Learning with scikit-learn
Description: This topic covers the fundamentals of machine learning methods, which use computers to predict properties of unknown data by learning from the properties of sample data. It also introduces scikit-learn, one of the most popular open-source machine learning frameworks written in Python.
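To make "predicting properties of unknown data from samples" concrete, here is a minimal scikit-learn sketch. The choice of the bundled iris data set and a k-nearest-neighbours classifier is an assumption made for illustration, not necessarily what the course notebooks use.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Small bundled data set: 150 flowers, 4 measurements each, 3 species.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples to stand in for "unknown" data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Learn from the training samples only.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)

# Fraction of held-out samples whose species is predicted correctly.
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Scoring on data the model never saw during fitting is what distinguishes a genuine prediction from memorization of the training samples.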
5. Introduction to Deep Learning with Keras
Description: Keras is a very popular software framework for developing deep learning models. This topic covers the basics of Deep Learning algorithms and provides hands-on instructions to build a non-trivial image classification model with Keras.
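A rough sketch of how a small image classifier might be declared in Keras is shown below. The input size, layer widths, and number of classes are placeholder assumptions, and the import path assumes Keras as bundled with TensorFlow 2; the model in the course notebook may differ.

```python
from tensorflow import keras
from tensorflow.keras import layers

num_classes = 10  # hypothetical number of image categories

# Two convolution + pooling stages followed by a softmax classifier.
model = keras.Sequential([
    keras.Input(shape=(32, 32, 3)),  # assumed 32x32 RGB images
    layers.Conv2D(16, kernel_size=3, activation="relu"),
    layers.MaxPooling2D(pool_size=2),
    layers.Conv2D(32, kernel_size=3, activation="relu"),
    layers.MaxPooling2D(pool_size=2),
    layers.Flatten(),
    layers.Dense(num_classes, activation="softmax"),
])

# Integer class labels pair with the sparse categorical cross-entropy loss.
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
model.summary()
```

Training would then be a call to `model.fit` on labeled images, which is omitted here because no data set is specified in this overview.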
6. Introduction to Natural Language Processing
Description: Natural language processing (NLP) uses computers to process and analyze natural language data. This topic covers the basics of natural language processing methods and provides hands-on instructions for analyzing natural language data.
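As a minimal illustration of processing natural language data, the sketch below tokenizes a sentence, drops a few common words, and counts what remains, using only the Python standard library. The example text and the tiny stop-word list are made up; a real workflow would typically rely on a dedicated NLP package.

```python
import re
from collections import Counter

# Hypothetical input text.
text = "Data science is fun. Data science uses data, code, and statistics."

# Lowercase the text and split it into word tokens.
tokens = re.findall(r"[a-z']+", text.lower())

# Tiny illustrative stop-word list; real lists are much longer.
stopwords = {"is", "and", "the", "a", "of"}

# Count the remaining tokens.
counts = Counter(t for t in tokens if t not in stopwords)
top = counts.most_common(2)

print(top)
```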
7. Computer Vision with PyTorch and Its Applications
Description: Convolutional Neural Networks (CNNs) have been applied in many fields, including self-driving cars, medical diagnosis, and smart monitoring. This topic covers PyTorch, a popular Python machine learning package, with an emphasis on established CNN architectures and pre-trained weights. A hands-on dog image classification session walks through data collection, model modification, and performance analysis.