CS 533: Natural Language Processing (Spring 2023)
Instructor: Karl Stratos (karl.stratos@rutgers.edu)
Lectures: Tuesday 2-5pm at FBOEHA
Textbook: None. The course will use self-contained slides/lecture notes and free online resources.
Canvas: All lectures, assignments, and projects will be managed on the
course Canvas page.
Overview
This graduate-level course will cover the technical foundations of modern natural language processing (NLP).
The course will cast NLP as an application of machine learning, in particular deep learning,
and focus on deriving general scientific and engineering principles that underlie state-of-the-art NLP systems today.
Goals.
 Understanding the goals, capabilities, and principles of NLP
 Acquiring mathematical tools to formalize NLP problems
 Acquiring implementation skills to build practical NLP systems
 Obtaining an ability to critically read and accurately evaluate conference papers in NLP
Prerequisites.
No previous exposure to deep learning or NLP is assumed.
However, the course will be most beneficial for students with some programming experience and familiarity with basic concepts in probability and statistics, calculus, and linear algebra.
Examples of such concepts include
 Random variables (continuous or discrete), expectation, mean/variance
 Matrix and vector operations
 Derivatives, partial derivatives, gradients
 Programming (in Python): familiarity with data structures and algorithms
If you are an undergraduate, you must meet the requirements described in the
CS Honors Program and submit a request form.
I will not be approving requests or giving out special permission numbers until it is closer to the beginning of the semester.
The prerequisites are as follows:
 Required: M250 (linear algebra), 112 (data structures), 206 (discrete II)
 Recommended: M251 (multivariable calculus), 533 (machine learning)
 Alternatives to 206: M477 (probability), S379 (basic probability theory), or instructor's permission
Structure
Grading.
 Entrance quiz: 5%
 Assignments: 50%
 Quizzes: 15%
 Project: 30%
Entrance quiz.
There will be an entrance quiz in the first class. It will help you assess whether you have a suitable technical background.
Assignments.
Assignments are the heart of this course.
There will be around 4 assignments. Each assignment will have both written and programming components.
For the written component, you are required to use LaTeX to write up your solutions.
If you have never used LaTeX before, you can pick it up quickly (tutorial, style guide).
For the programming component, you will implement and run your code in a Jupyter Notebook on Google Colab.
While this setup is slightly detached from the real-world setting (i.e., GitHub repositories),
it allows everyone to get started right away in a uniform software environment.
Quizzes.
There will be 3 quizzes throughout the semester, each counting for 5% of the grade.
Project.
In the later part of the semester, you will work on a course project in conjunction with the usual coursework.
By that point, the course will have provided the basic knowledge needed to understand the current research landscape of the field.
You will select a recent paper (from a list prepared by the instructor, or by special permission), replicate and possibly build on its results,
present the work in class, and submit a final report.
The project will be due at the end of the semester and graded as follows:
 5-minute presentation (10% of the grade): You must strictly adhere to the
paper-writing tips by Jennifer Widom
and clearly explain (1) what the problem is, (2) why it is interesting and important,
(3) why it is hard, (4) why previous approaches fail, and (5) the key components of the paper's/your approach and results.
 4-page report (20% of the grade): The report will be written and evaluated like a conference paper.
Academic integrity policy.

Assignments: Collaboration is allowed and encouraged, as long as you (1) write up your solutions entirely on your own, and (2) list the names of the student(s) you collaborated with in your write-up.
If you find a solution online, clearly acknowledge the source and still write up your own solution entirely on your own.
Copying solutions from others or from the internet is strictly prohibited.
 Quizzes: Cheating is strictly prohibited.
 Project: Collaboration in groups of up to 3 students is allowed.
If a student is caught cheating or plagiarizing, the incident will be reported to the Office of Student Conduct and the student will receive zero points for the assignment/quiz, which will result in a low final grade.
Plan
Topics.
We will first cover fundamentals of deep learning, with a special emphasis on
 The universality of neural networks
 Cross-entropy loss and gradient-based optimization
 The transformer architecture
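As a small preview of the second item above, here is a toy Python sketch (illustrative only, not course-provided code; all numbers are made up) of cross-entropy loss and a single gradient-descent step for a one-feature logistic-regression model:

```python
import math

def sigmoid(z):
    # Squash a score into a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def cross_entropy(p, y):
    # y is the gold label (0 or 1); p is the predicted probability of label 1.
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# One toy training example: feature x, label y, scalar weight w.
x, y = 2.0, 1
w = 0.0
lr = 0.1  # learning rate (arbitrary illustrative value)

p = sigmoid(w * x)          # model prediction before the update
loss = cross_entropy(p, y)  # cross-entropy loss before the update

# For this model, the gradient of the loss w.r.t. w is (p - y) * x.
w -= lr * (p - y) * x

# One gradient step reduces the loss on this example.
assert cross_entropy(sigmoid(w * x), y) < loss
```

The course will develop this picture in full generality: richer models in place of the single weight, and stochastic gradient descent over whole datasets in place of the single update above.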
Then, we will apply these fundamentals to NLP tasks, especially focusing on the topics of
 Pretrained language models (aka. "foundation models")
 Retrievers
Most NLP tasks can be approached by applying pretrained language models and retrievers, including simple text classification tasks (e.g., sentiment analysis),
machine translation, summarization, entity linking, and coreference resolution.
Additional topics include
 Latent-variable models
 Structured prediction problems (tagging, parsing)
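To give a flavor of the dynamic programming idea behind the second item, here is a toy Python sketch (illustrative only, not course code; the tag set and all scores are made up) of the Viterbi algorithm for tagging, which finds the highest-scoring tag sequence under local transition and emission scores:

```python
def viterbi(obs, tags, emit, trans, start):
    # obs: observed tokens; emit[tag][token], trans[prev][tag], start[tag]
    # are additive (log-space) scores.
    V = [{t: start[t] + emit[t][obs[0]] for t in tags}]
    back = []
    for tok in obs[1:]:
        scores, ptrs = {}, {}
        for t in tags:
            # Best previous tag to transition from, by total score.
            prev = max(tags, key=lambda p: V[-1][p] + trans[p][t])
            scores[t] = V[-1][prev] + trans[prev][t] + emit[t][tok]
            ptrs[t] = prev
        V.append(scores)
        back.append(ptrs)
    # Trace back the highest-scoring path.
    best = max(tags, key=lambda t: V[-1][t])
    path = [best]
    for ptrs in reversed(back):
        path.append(ptrs[path[-1]])
    return list(reversed(path))

# Toy example with two tags and hand-picked scores.
tags = ["N", "V"]
start = {"N": 0.0, "V": -2.0}
trans = {"N": {"N": -2.0, "V": 0.0}, "V": {"N": 0.0, "V": -2.0}}
emit = {"N": {"dog": 0.0, "runs": -3.0}, "V": {"dog": -3.0, "runs": 0.0}}
assert viterbi(["dog", "runs"], tags, emit, trans, start) == ["N", "V"]
```

Exhaustive search over tag sequences is exponential in the sentence length; the dynamic program above is linear in the length (and quadratic in the number of tags), which is the kind of structure the course will exploit for tagging and parsing.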
Tentative schedule.
Date | Topics
Week 1 (Jan 17) | General introduction, text classification, cross-entropy loss
Week 2 (Jan 24) | Stochastic gradient descent, regularization, introduction to deep learning
Week 3 (Jan 31) | Deep learning continued, backpropagation
Week 4 (Feb 7) | Neural architectures for sequences: convolutional, recurrent, transformer
Week 5 (Feb 14) | Language models, sequence-to-sequence models, machine translation
Week 6 (Feb 21) | Pretrained language models, masked language modeling
Week 7 (Feb 28) | Retrieval from a knowledge base, noise contrastive estimation, entity retrieval
Week 8 (Mar 7) | Knowledge-intensive language tasks, question answering
(Mar 14) | Spring Recess
Week 9 (Mar 21) | Latent-variable models, variational autoencoder
Week 10 (Mar 28) | Structured prediction, dynamic programming algorithms for tagging/parsing
Week 11 (Apr 4) | Special topics: TBD (project proposal due)
Week 12 (Apr 11) | Special topics: TBD
Week 13 (Apr 18) | Special topics: TBD
Week 14 (Apr 25) | Project presentations
Online resources.
 Natural Language Processing by Jacob Eisenstein
 A Primer on Neural Network Models for Natural Language Processing by Yoav Goldberg