CS 533: Natural Language Processing (Spring 2023)

Instructor: Karl Stratos (karl.stratos@rutgers.edu)
Lectures: Tuesday 2-5pm at FBO-EHA
Textbook: None. The course will use self-contained slides/lecture notes and free online resources.
Canvas: All lectures, assignments, and projects will be managed on the course Canvas page.

Overview

This graduate-level course will cover technical foundations of modern natural language processing (NLP). The course will cast NLP as an application of machine learning, in particular deep learning, and focus on deriving general scientific and engineering principles that underlie state-of-the-art NLP systems today.

Goals.

Understanding the goals, capabilities, and principles of NLP
Acquiring mathematical tools to formalize NLP problems
Acquiring implementation skills to build practical NLP systems
Developing the ability to critically read and accurately evaluate conference papers in NLP

Prerequisites. No previous exposure to deep learning or NLP is assumed. However, the course will be most beneficial for students with some programming experience and familiarity with basic concepts in probability and statistics, calculus, and linear algebra. Examples of such concepts include

Random variables (continuous or discrete), expectation, mean/variance
Matrix and vector operations
Derivatives, partial derivatives, gradients
Programming (in Python): familiarity with data structures and algorithms

If you are an undergraduate, you must meet the requirements described in the CS Honors Program and submit a request form. I will not be approving requests or giving out special permission numbers until it is closer to the beginning of the semester. The prerequisites are as follows:

Required: M250 (linear algebra), 112 (data structures), 206 (discrete II)
Recommended: M251 (multivariable calculus), 536 (machine learning)
Alternatives to 206: M477 (probability), S379 (basic probability theory), or instructor's permission

Structure

Grading.

Entrance quiz: 5%
Assignments: 50%
Quizzes: 15%
Project: 30%
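
As a rough illustration of how these weights combine, here is a minimal Python sketch. It assumes each component is scored on a 0-100 scale and that the final grade is a plain weighted average with no curve or rounding policy; neither assumption is stated in this syllabus. The function name `final_grade` and the example scores are hypothetical.

```python
# Weights (percent of the final grade) taken from the grading breakdown above.
WEIGHTS = {
    "entrance_quiz": 5,
    "assignments": 50,
    "quizzes": 15,   # the syllabus states 3 quizzes at 5% each
    "project": 30,   # the syllabus states 10% presentation + 20% report
}


def final_grade(scores):
    """Combine per-component scores (each assumed to be 0-100) into a final percentage."""
    # Sanity check: the four components should account for the whole grade.
    assert sum(WEIGHTS.values()) == 100
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores, for illustration only.
example_scores = {
    "entrance_quiz": 80,
    "assignments": 90,
    "quizzes": 70,
    "project": 85,
}
print(final_grade(example_scores))  # prints 85.0
```

Since assignments carry half of the weight, they dominate the result under this assumed scheme.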

Entrance quiz. There will be an entrance quiz in the first class. It will help you assess whether you have a suitable technical background.

Assignments. Assignments are the heart of this course. There will be around 4 assignments, each with a written component and a programming component. For the written component, you are required to use LaTeX to write up your solutions. If you have never used LaTeX before, you can pick it up quickly (tutorial, style guide). For the programming component, you will implement and run your code online in a Jupyter Notebook on Google Colab. While this setup is somewhat detached from real-world practice (e.g., GitHub repositories), it allows everyone to get started right away in a uniform software environment.

Quizzes. There will be 3 quizzes throughout the semester, each counting for 5% of the grade.

Project. In the later part of the semester, you will work on a course project alongside the usual coursework. By then, the course will have provided the basic knowledge needed to understand the current research landscape of the field. You will select a recent paper (from a list prepared by the instructor, or by special permission), replicate and possibly build on its results, present the work in class, and submit a final report. The project will be due at the end of the semester and graded as follows:

5-minute presentation (10% of the grade): You must strictly adhere to the paper-writing tips by Jennifer Widom and clearly explain (1) what the problem is, (2) why it is interesting and important, (3) why it is hard, (4) why previous approaches fail, and (5) the key components of the paper's/your approach and results.
4-page report (20% of the grade): The report will be written and evaluated like a conference paper.

Academic integrity policy.

Assignments: Collaboration is allowed and encouraged, as long as you (1) write up your solution entirely on your own, and (2) list the names of the students you collaborated with in your writeup. If you find a solution online, clearly acknowledge the source and still write up your solution entirely on your own. Copying solutions from others or from the internet is strictly prohibited.
Quizzes: Cheating is strictly prohibited.
Project: Collaboration in groups of up to 3 students is allowed.

If a student is caught cheating or plagiarizing, the incident will be reported to the Office of Student Conduct and the student will receive zero points for the assignment or quiz, which will result in a low final grade.

Plan

Topics. We will first cover fundamentals of deep learning, with a special emphasis on

The universality of neural networks
Cross-entropy loss and gradient-based optimization
The transformer architecture
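
To give a flavor of the second item, here is a minimal Python sketch of the cross-entropy loss and a few steps of plain gradient descent. It is an illustration under simplifying assumptions, not course material: there is a single training example, the "parameters" are just three class logits, and the learning rate and step count are arbitrary. All function names are hypothetical.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution (shifted for numerical stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-probability that the model assigns to the correct class."""
    return -math.log(softmax(logits)[label])


def gradient(logits, label):
    """Gradient of the loss with respect to the logits: softmax(logits) - one_hot(label)."""
    probs = softmax(logits)
    return [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]


# Toy setup (assumed for illustration): three classes, the correct class is index 2.
logits = [0.0, 0.0, 0.0]
label = 2
learning_rate = 1.0

initial_loss = cross_entropy(logits, label)  # equals log(3) for a uniform prediction
for _ in range(5):
    grad = gradient(logits, label)
    # Gradient descent: move each logit against its gradient.
    logits = [z - learning_rate * g for z, g in zip(logits, grad)]
final_loss = cross_entropy(logits, label)

print(initial_loss, final_loss)
```

The loss starts at log(3), the value for a uniform guess over three classes, and shrinks as the updates shift probability mass toward the correct class.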

Then, we will apply these fundamentals to NLP tasks, especially focusing on the topics of

Pretrained language models (a.k.a. "foundation models")
Retrievers

Most NLP tasks can be approached by applying pretrained language models and retrievers. These include all simple text classification tasks (e.g., sentiment analysis), machine translation, summarization, entity linking, and coreference resolution. Additional topics include

Latent-variable models
Structured prediction problems (tagging, parsing)

Tentative schedule.

Date	Topics
Week 1 (Jan 17)	General introduction, text classification, cross-entropy loss
Week 2 (Jan 24)	Stochastic gradient descent, regularization, introduction to deep learning
Week 3 (Jan 31)	Deep learning continued, backpropagation
Week 4 (Feb 7)	Neural architectures for sequences: convolutional, recurrent, transformer
Week 5 (Feb 14)	Language models, sequence-to-sequence models, machine translation
Week 6 (Feb 21)	Pretrained language models, masked language modeling
Week 7 (Feb 28)	Retrieval from a knowledge base, noise contrastive estimation, entity retrieval
Week 8 (Mar 7)	Knowledge-intensive language tasks, question answering
	Spring Recess
Week 9 (Mar 21)	Latent-variable models, variational autoencoder
Week 10 (Mar 28)	Structured prediction, dynamic programming algorithms for tagging/parsing
Week 11 (Apr 4)	Special topics: TBD (project proposal due)
Week 12 (Apr 11)	Special topics: TBD
Week 13 (Apr 18)	Special topics: TBD
Week 14 (Apr 25)	Project presentations

Online resources.

Natural Language Processing by Jacob Eisenstein
A Primer on Neural Network Models for Natural Language Processing by Yoav Goldberg