C S 12A: INTRODUCTION TO MACHINE LEARNING
Foothill College Course Outline of Record
| Heading | Value |
|---|---|
| Effective Term: | Summer 2025 |
| Units: | 4.5 |
| Hours: | 4 lecture, 2 laboratory per week (72 total per quarter) |
| Prerequisite: | C S 3A. |
| Advisory: | Students will benefit from prior exposure to statistics. |
| Degree & Credit Status: | Degree-Applicable Credit Course |
| Foothill GE: | Non-GE |
| Transferable: | CSU/UC |
| Grade Type: | Letter Grade (Request for Pass/No Pass) |
| Repeatability: | Not Repeatable |
Student Learning Outcomes
- Explain the difference between the supervised, unsupervised, and reinforcement learning paradigms, and define key algorithms and models within each paradigm
- Use Python packages to train machine learning models and evaluate their performance using metrics appropriate to the task (e.g., accuracy, precision, recall)
- Understand how dataset quality can contribute to or detract from equitable model outcomes, and the ramifications of these outcomes on society
Description
Course Objectives
The student will be able to:
- Explain the role of statistics and linear algebra in artificial intelligence
- Use Python packages such as Pandas and Matplotlib to explore datasets prior to implementing a machine learning model
- Describe the difference between regression and classification
- Implement dataset splitting and cross-validation and explain the value of these practices
- Evaluate model performance, and explain underfitting, overfitting, and the bias-variance tradeoff
- Independently implement at least one model in each of supervised, unsupervised, and reinforcement learning
- Use Python packages such as scikit-learn, Keras, TensorFlow, and PyTorch to develop machine learning models
- Discuss security, ethics, and equity in the context of machine learning, including explainability, accountability, and interpretability
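As one illustration of the dataset splitting and cross-validation objective above, the following is a minimal sketch in plain Python. The function names `train_test_split_indices` and `k_fold_indices`, the 20% test fraction, and the fixed seeds are assumptions made for illustration and do not come from the course outline. In practice, a package such as scikit-learn offers equivalent utilities.

```python
import random


def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle the indices 0..n_samples-1 and split them into train and test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation.

    Every sample lands in exactly one validation fold.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k folds of nearly equal size
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


if __name__ == "__main__":
    train, test = train_test_split_indices(10)
    print("train size:", len(train), "test size:", len(test))
    for train, validation in k_fold_indices(10, k=5):
        print("validation fold:", sorted(validation))
```

Holding out a test set gives an estimate of performance on unseen data, while rotating the validation fold lets every sample contribute to both training and validation across the k runs.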
Course Content
- Mathematical foundations
- Statistics
- Vectors
- Matrices
- Preparing data
- Explore datasets and formulate a question that machine learning can solve
- Visualizing
- Cleaning
- Encoding
- Curse of dimensionality
- PCA
- Data pitfalls
- Noise
- Techniques of regression and classification
- Categorize machine learning problems as regression or classification
- Preparing a dataset for validation
- Split a dataset into a train set and a test set
- Partition a dataset for k-fold cross-validation
- Error analysis and performance evaluation
- RMSE
- Bias and variance
- Overfitting and underfitting
- Confusion matrices
- Precision, recall, and F1
- Machine learning models
- Supervised learning
- Common features and techniques
- Hyperparameters
- Loss functions
- Gradient descent
- Linear regression
- Logistic regression
- K-nearest neighbors
- Support vector machines
- Decision trees
- Entropy
- Information gain
- Ensemble learning
- Bagging
- Boosting
- Naïve Bayes
- Artificial neural networks
- Common features and techniques
- Unsupervised learning
- Clustering
- Reinforcement learning
- Markov decision process
- Q-learning
- Deep learning
- Supervised learning
- Machine learning packages
- Installation and importing
- Documentation and API
- Machine learning pipeline
- Safety and ethics of machine learning
- Accountability
- Interpretability
- Explainability
- Disproportionate impact
- Historical examples of racism and other harmful bias in machine learning
- Preventing racism and other harmful bias in machine learning
Lab Content
- Familiarity with package installation
- Installation
- Importing
- Documentation
- Exploration of data table and graphing packages
- Identify and address incomplete data
- Find key statistics from data tables
- Generate a pair plot to visually explore covariance
- Prepare a dataset for machine learning
- Identify and address missing information
- Identify extraneous columns
- Quantify qualitative data where possible and necessary
- Perform binary or class encoding
- Apply function transformations to normalize data
- Apply principal component analysis to reduce dimensionality
- Split data into testing and training sets
- Split data into k-fold sets
- Use machine learning packages to develop and test at least one model in each domain:
- Supervised learning
- Unsupervised learning
- Reinforcement learning
- Use Python packages or student code to produce error reports
- RMSE
- Confusion matrix
- Precision, recall, and F1
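The error reports listed above can be produced with student code as well as with packages. The following is a minimal sketch in plain Python, assuming a binary classification task with labels 0 and 1. The function names and the sample labels are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def rmse(y_true, y_pred):
    """Root mean squared error for a regression task."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    counts = Counter((t == positive, p == positive) for t, p in zip(y_true, y_pred))
    tp = counts[(True, True)]    # actual positive, predicted positive
    fp = counts[(False, True)]   # actual negative, predicted positive
    fn = counts[(True, False)]   # actual positive, predicted negative
    tn = counts[(False, False)]  # actual negative, predicted negative
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1, returning 0.0 when a ratio is undefined."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    y_true = [1, 0, 1, 1, 0, 0]  # hypothetical ground-truth labels
    y_pred = [1, 0, 0, 1, 1, 0]  # hypothetical model predictions
    print("confusion (tp, fp, fn, tn):", confusion_counts(y_true, y_pred))
    print("precision, recall, F1:", precision_recall_f1(y_true, y_pred))
    print("RMSE:", rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Writing the metrics by hand once makes the definitions concrete before relying on a package implementation.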
Special Facilities and/or Equipment
2. The college will provide a website or course management system with an assignment posting component (through which all lab assignments are to be submitted) and a forum component (where students can discuss course material and receive help from the instructor). This applies to all sections, including on-campus (i.e., face-to-face) offerings.
3. When the course is taught online, the college will provide a fully functional and maintained course management system through which the instructor and students can interact.
4. When the course is taught online, students must have active email accounts and ongoing access to computers with internet access.
Method(s) of Evaluation
Tests and quizzes
Lab notebook
Written laboratory assignments which include source code, sample runs, and documentation
Reflective papers
Final examination or project
Method(s) of Instruction
Instructor-authored lectures which include mathematical foundations, theoretical motivation, and coding implementation of machine learning algorithms
Detailed review of assignments which includes model solutions and specific comments on the student submissions
Discussion which engages students and instructor in an ongoing dialog about machine learning
Instructor-authored labs that rigorously demonstrate a student's ability to implement machine learning models
Representative Text(s) and Other Materials
Géron, Aurélien. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. 2022.
Raschka, Sebastian, Yuxi (Hayden) Liu, and Vahid Mirjalili. Machine Learning with PyTorch and Scikit-Learn. 2022.
Types and/or Examples of Required Reading, Writing, and Outside of Class Assignments
- Reading
- Textbook assigned reading averaging 30 pages per week
- Reading the supplied handouts and modules averaging 10 pages per week
- Reading online resources pertinent to programming, as directed by the instructor through links
- Reading library and reference material, as directed by the instructor through course handouts
- Writing
- Writing technical prose documentation that supports and describes the programs that are submitted for grades