Professional Education Workshops

Data science is a multidisciplinary field that uses statistics, data analysis, machine learning, algorithms, software, and computing systems to extract information, acquire knowledge, and gain insight into the context in which data is generated. The Texas A&M Institute of Data Science (TAMIDS) workshop on Data Science Foundations and Computational Practice is a week-long intensive workshop that equips participants with a range of skills for the professional practice of data science.

This workshop is designed for organizations that aim to help their workforce build knowledge and hands-on experience in applying computational methods of data science. The program consists of ten interactive, three-hour modules that combine methodological instruction with practical application to real data through hands-on programming and computation. Participants will use open-source tools and libraries to develop proficiency in foundational and state-of-the-art data science methods. The workshop can be delivered remotely (in synchronous, asynchronous, or hybrid formats) or on-site, depending on the organization’s needs.

Participants should have prior familiarity with Linux or similar operating system environments, experience with Python or a comparable programming language, and an introductory background in data analysis and statistics. The modules are delivered by Texas A&M faculty experts and draw on materials developed and field-tested in previous data science webinars, boot camps, and tutorials. TAMIDS can also collaborate with organizations to create customized workshop content that incorporates domain-specific applications.

Learning Outcomes

Upon completion of the workshop, each participant should be able to:

Create and manage an open-source software environment for data science projects.
Use open-source tools to read, update, and write JSON, CSV, XML, and other structured data formats.
Apply the NumPy and SciPy packages for numerical and statistical computation on data.
Gain insight into data through analysis and visualization using the open-source pandas and Matplotlib libraries.
Apply common supervised and unsupervised machine learning methods, and identify pitfalls such as overfitting.
Design and develop non-trivial programs for data science in Python using libraries and frameworks for machine learning and distributed computation, including scikit-learn and Spark.
Use reinforcement learning to optimize control of artificial intelligence systems in the absence of a specific reward model.
Model complex and high-dimensional data by learning latent low-dimensional representations.
Integrate deep-learning frameworks such as TensorFlow and PyTorch into data analytics workflows.
Develop models and perform feature selection using automated machine learning.

Please send questions about these workshops to Dr. Nick Duffield, Director of TAMIDS, by email: duffieldng@tamu.edu