CS 445: Machine Learning (Fall 2020)


Instructor: Karl Stratos (karl.stratos@rutgers.edu)
TA: Shuning Jin (shuning.jin@rutgers.edu)
Instructor office hours: Tuesday 4-5pm, or by appointment
TA office hours: Monday 5-6pm, Wednesday 4-5pm, or by appointment


Course format. The course will be asynchronous remote. Instead of having specific lecture times, there will be constant online interactions on Canvas along with weekly virtual office hours. The course will nonetheless be highly structured. Here is what to expect:
  1. A video lecture will be uploaded on Canvas every Tuesday and Thursday morning. An advantage of this format is that students can watch each lecture at their own convenience.
  2. Along with lectures, there will be optional reading from publicly available online textbooks and other resources (the slides are self-contained).
  3. The course will be driven by written/programming assignments based on the materials covered in the lectures. Students can ask questions about lectures and assignments at any time on Canvas and in the virtual office hours. Any question posted on Canvas will be answered within two days, if not sooner. To guarantee fast turnaround and to involve other students, please use the Canvas discussion board and reserve other means of communication, such as email, for private matters.
  4. In addition to assignments, there will be a series of short timed quizzes and a take-home open-book final exam, as well as an entrance quiz.
Entrance quiz. This is a technical course. On the first day of class (Sep 1) at a designated time, we will have a 40-minute entrance quiz on Canvas to assess whether each student has a suitable background. It will give a clear idea of whether the student should take the course or drop it. The entrance quiz will count as 5% of the final grade.

Academic integrity policy.
  1. Assignments: collaboration is allowed and encouraged, as long as you (1) write up your solution entirely on your own, and (2) list the names of the students you collaborated with in your writeup. If you find a solution online, clearly acknowledge the source and still write up your solution entirely on your own. Copying solutions from others or from the internet is strictly prohibited.
  2. Quizzes and final: cheating is strictly prohibited.
If a student is caught cheating or plagiarizing, the incident will be reported to the Office of Student Conduct and the student will receive zero points for the assignment/quiz/exam, which will result in a low final grade.


Course description. This course is a rigorous introduction to machine learning aimed at advanced undergraduate students in computer science, mathematics, and statistics. Machine learning is a vast field that requires years of hard work to master. The course is designed to provide a solid starting point by focusing on timeless technical foundations.

Goals.
  1. Understanding the goals, capabilities, and principles of machine learning
  2. Acquiring mathematical tools to formalize machine learning problems
  3. Acquiring implementation skills to build practical machine learning systems
Audience and prerequisites. No previous exposure to machine learning is assumed. However, the course will be most beneficial for students with some programming experience and familiarity with basic concepts in probability and statistics, calculus, and linear algebra. Examples of such concepts include
  • Random variables (continuous or discrete), expectation, mean/variance
  • Matrix and vector operations
  • Derivatives, partial derivatives, gradients
  • Programming (in Python): familiarity with data structures and algorithms
More specifically, prerequisites are as follows:
  • Required: M250 (linear algebra), 112 (data structures), 206 (discrete II)
  • Recommended: M251 (multivariable calculus)
  • Alternatives to 206: M477 (probability), S379 (basic probability theory), or instructor's permission
The entrance quiz will be useful for evaluating whether the student meets the prerequisites.

Grading.
  1. Assignments: 50%
  2. Entrance quiz: 5%
  3. Quizzes (excluding the entrance quiz): 15%
  4. Final (take-home and open-book): 30%
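As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes each component is scored on a 0-100 scale and that the final grade is a plain weighted sum; the syllabus does not state the exact computation, and the function name `final_grade` and the example scores are hypothetical.

```python
# Weights taken from the grading list above (as fractions of the final grade).
WEIGHTS = {
    "assignments": 0.50,
    "entrance_quiz": 0.05,
    "quizzes": 0.15,
    "final": 0.30,
}


def final_grade(scores):
    """Combine per-component scores (assumed 0-100) into a weighted total."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"assignments": 90, "entrance_quiz": 80, "quizzes": 70, "final": 85}
print(round(final_grade(example), 2))
```

Under these assumptions, the assignments alone account for half of the total, so they influence the result more than the final exam does.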
The assignment report must be written in LaTeX using a provided assignment report template. If you have never used LaTeX before, you can pick it up quickly (tutorial, style guide).

Online textbooks (for optional reading).
  1. Pattern Recognition and Machine Learning (Bishop, 2006)
  2. Machine Learning: A Probabilistic Perspective (Murphy, 2012)
  3. Foundations of Machine Learning (Mohri, Rostamizadeh, and Talwalkar, 2018)
Tentative plan.
Date Topics
Week 1 (Sep 1, Sep 3) General Introduction, Review of Prerequisites
Week 2 (Sep 8, Sep 10) Regression, Nearest Neighbors, Least Squares, Risk, Empirical Risk Minimization
Week 3 (Sep 15, Sep 17) Linear Regression, Maximum Likelihood Estimation, Gradient Descent
Week 4 (Sep 22, Sep 24) Generalized Linear Regression, Overfitting, Regularization
Week 5 (Sep 29, Oct 1) Error Decomposition, Bias-Variance Tradeoff
Week 6 (Oct 6, Oct 8) Classification, Logistic Regression, Stochastic Gradient Descent
Week 7 (Oct 13, Oct 15) Large Margin Learning, Representer Theorem
Week 8 (Oct 20, Oct 22) Support Vector Machines (SVMs), Kernel Machines
Week 9 (Oct 27, Oct 29) Decision Trees, Ensemble Methods
Week 10 (Nov 3, Nov 5) Generative Classifiers, Mixture Models
Week 11 (Nov 10, Nov 12) Expectation Maximization (EM), Evidence Lower Bound
Week 12 (Nov 17, Nov 19) Deep Learning, Backpropagation
Week 13 (Nov 24) Neural Architectures for Language, Speech, and Vision
Week 14 (Dec 1, Dec 3) Dimensionality Reduction, Clustering
Week 15 (Dec 8, Dec 10) Online Learning, Review, Other Topics in Machine Learning
Final (Dec 15) 48-Hour Take-Home Open-Book Final