COMP 469 – Artificial Intelligence / Machine Learning – Fall 2024

AI and ML have recently taken the world by storm, and applications are multiplying: autonomous vehicles, ChatGPT, recommender systems, and automated software development. This course covers recent developments in ML from an applied (rather than theoretical) perspective, motivated by the conviction that everyone can, and should, be familiar with a technology that will deeply affect the coming decades.

Course Outline: The course covers the many aspects of how human intelligence might be encoded in computer programs and machines such as robots. This includes topics in Natural Language Processing, Computer Vision, Expert Systems, and Automated Problem Solving. The course will especially concentrate on the latest developments in Machine Learning (ML), using Amazon Web Services (AWS) ML services.

Student Prerequisites: This course is intended to be advanced but self-contained. However, some familiarity (not expertise!) with the following areas will be helpful:

  • Linear Algebra
  • Statistics and probability
  • Fundamentals of programming, especially Python
  • Basic concepts of Cloud Computing (e.g., the material covered by the AWS Certified Cloud Practitioner certification; at CSUCI this content is taught in COMP 347)
  • Jupyter Notebook, an interactive lab environment widely used for data analytics and Machine Learning prototyping. It will be the main tool in the course, so some familiarity is helpful, but it is not strictly necessary: students will have an opportunity to learn it during the course.

Course Delivery Method: Weekly online meetings will outline the content, but students are expected to learn asynchronously using the materials provided: lecture slides, student guides, coding lab activities in Jupyter Notebooks, and knowledge checks. All materials are provided to students at no cost. We will use Jupyter Notebook on two platforms: Amazon SageMaker Studio Lab and Google Colab. Canvas pages:

Grade: TBD

Partners

As CSUCI is a partner with AWS Academy and AWS Machine Learning University, there is no cost to the student.

Student Learning Outcomes

  • Determine if a business problem is a good candidate for a machine learning solution based on problem goal, available data, scalability, and other factors.
  • Gain hands-on experience with Jupyter notebooks, Amazon SageMaker, Google Colab and Python ML libraries, which are powerful tools for developing and deploying machine learning models.
  • Learn about AutoML and AutoGluon and how they can be used to automate the tedious parts of the machine learning pipeline, freeing up time for more important tasks.
  • Understand the importance of data preprocessing and feature engineering, recognize overfitting and underfitting, and learn how to mitigate overfitting with regularization techniques.
  • Learn about different types of machine learning models, including tree-based models, regression models, and ensemble models, and how to select and evaluate the best model for a given task.

Content

Part 1

  1. Introduction: what is Machine Learning?
  2. Jupyter Notebooks and Amazon SageMaker
  3. Exploratory Data Analysis
  4. Responsible ML
  5. Types of ML
  6. Overfitting and Underfitting
  7. AutoML and AutoGluon
  8. Generating batch predictions from AutoGluon models
  9. Basic Feature Engineering
  10. Tree-based models
  11. Optimization and regression
  12. Hyperparameter tuning
  13. Ensembling and Boosting
  14. Exploring bias in data and fairness metrics
  15. Implementing an ML pipeline
  16. Introducing forecasting
  17. Introducing Natural Language Processing (NLP)
  18. Introducing Computer Vision

Part 2

  1. Introduction to Deep Learning on Text and Images
  2. Introduction to Neural Networks: Layers and Activations
  3. How Neural Networks Learn
  4. First Examples of Neural Networks
  5. Building an End-to-End Neural Network Solution
  6. Neural Network Engineering
  7. Challenges of Textual Data and Domains of NLP
  8. Processing Text
  9. Word Embeddings
  10. Recurrent Neural Networks
  11. RNN Example with a Practical Dataset
  12. Transformers
  13. How Are Images Stored in a Computer?
  14. The Concept of Convolution
  15. Convolutional Neural Networks
  16. ResNet: The Trade-Offs of Depth and Model Performance
  17. Modern Architectures
  18. Transfer Learning

Schedule

MLTA stands for Machine Learning Through Application, which is Part 1 of the course, and ADLTID stands for Application of Deep Learning to Text and Image Data, which is Part 2 of the course.

Date    Topic            Module       Labs
Aug 27  Intro to ML      MLTA – M1    Lab 1,2
Sep 3   Intro to ML      MLTA – M1    Lab 3,4,5
Sep 10  Tabular Data     MLTA – M2    Lab 1,2
Sep 17  Tabular Data     MLTA – M2    Lab 3,4
Sep 24  Tabular Data     MLTA – M2    Lab 5,6
Oct 1   Responsible ML   MLTA – M3    Lab 1,2
Oct 8   Responsible ML   MLTA – M3    Lab 3,4
Oct 15  Neural Networks  ADLTID – M1  Lab 1,2
Oct 22  Neural Networks  ADLTID – M1  Lab 3,4
Oct 29  Text Data        ADLTID – M2  Lab 1
Nov 5   Text Data        ADLTID – M2  Lab 2,3
Nov 12  Text Data        ADLTID – M2  Lab 4,5
Nov 19  Computer Vision  ADLTID – M3  Lab 1,2
Nov 26  Computer Vision  ADLTID – M3  Lab 3,4
Dec 3   Computer Vision  ADLTID – M3  Lab 5,6