CS 533: Natural Language Processing (NLP)
Instructor: Karl Stratos
TA: Zuohui Fu (office hours: Tuesday 3:30-4:30pm, Hill 273)
Time and location: Wednesday 12-3pm at BE 252
Instructor office hours: Wednesday 3:20-4:30pm at Tillett 111H
This project-centered graduate course will cover technical foundations of modern NLP.
Students are expected to begin working on course projects at the start of the course and to continue throughout the semester,
culminating in (1) in-class project presentations and (2) written reports that aspire to conference publication level.
The course will have two parts that happen in parallel.
The first part is standard lecture-based classes in which the instructor exposes students to fundamental concepts and applications in the field.
The second part is continual discussions and brainstorming about course projects and self-initiated research efforts.
There is no required textbook: all materials are publicly available online resources.
Please use the Canvas site
to ask questions regarding lectures/homeworks/projects, to submit assignments, and to find announcements.
Course goals:
- Achieving an understanding of the foundational concepts and tools used in modern NLP
- Obtaining the ability to critically read and accurately evaluate conference papers in NLP
- Finding new research projects that persist beyond this course

Audience and prerequisites. No previous exposure to NLP is assumed. However, this is a fast-paced course designed for self-motivated graduate or advanced undergraduate students with a solid technical background in probability and statistics, calculus, and linear algebra.
Technical requirements include:
- Probabilistic reasoning (e.g., What is the conditional probability of Y=y given X=x, assuming knowledge of a joint distribution over X and Y?)
- Intimate and intuitive understanding of matrix and vector operations (e.g., What is the shape of a matrix product? How similar are two vectors?)
- Mathematical notions in optimization (e.g., What does it mean for a function to have zero derivative at a certain point?)
If you cannot complete A1 comfortably, you may need to consult with the instructor about whether your background meets the prerequisites.
Significant programming experience in Python is necessary for programming assignments and course projects.
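As a self-check on the requirements above, the three bullet points can each be exercised in a few lines of Python/NumPy. This is only a sketch of the expected fluency, not course material; the distribution and vectors are made up for illustration:

```python
import numpy as np

# A made-up joint distribution over X in {0, 1} and Y in {0, 1, 2}: P[x, y]
P = np.array([[0.10, 0.20, 0.10],
              [0.25, 0.15, 0.20]])

# Conditioning: P(Y=y | X=x) = P(x, y) / sum_y' P(x, y')
P_y_given_x = P / P.sum(axis=1, keepdims=True)  # each row now sums to 1

# Shape of a matrix product: (2, 3) @ (3, 4) -> (2, 4)
A = np.ones((2, 3))
B = np.ones((3, 4))
C = A @ B

# Similarity of two vectors: cosine similarity (parallel vectors give 1.0)
u, v = np.array([1.0, 2.0]), np.array([2.0, 4.0])
cos = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
```

If each line here reads as obvious, the mathematical prerequisites are likely in place.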
Grading:
- Project: 40% (written report 30%, presentation 10%)
- Exam (in-class and open book): 30%
- Assignments: 20%
- Participation: 10%
The assignment report must be written in LaTeX using the provided assignment report template.
Similarly, the project report must be written in LaTeX using the provided project report template
and will be reviewed by the instructor like a conference submission.
- Proposal (due 3/24) : submit an initial proposal using this template.
- Milestone (due 4/15): submit an informal 1-2 page progress report.
- Presentation (tentatively 4/29): in-class presentation
- Final report (due 5/4): submit a final report
|Week 1 (January 22)
||Logistics, Introduction, Language Modeling
Michael Collins' notes on n-gram models and log-linear models
||A1 [code] (Due 2/4)
|Week 2 (January 29)
||Deep Learning for NLP: Neural Language Modeling
Colah's blogs on deep learning and LSTMs,
NLM papers using feedforward (Bengio et al., 2003), recurrent (Mikolov et al., 2010; Melis et al., 2018),
and attention-based (GPT-2) architectures
|Week 3 (February 5)
||Deep Learning for NLP: Conditional Neural Language Modeling
BLEU, input-feeding attention, Google's NMT
||A2 [code] (Due 2/18)
|Week 4 (February 12)
||Deep Learning for NLP: Backpropagation, Self-Attention, Representation Learning by Language Modeling
Backpropagation, Transformer (note), ELMo, BERT
|Week 5 (February 19)
||Structured Prediction in NLP: Tagging
Michael Collins' notes on
HMMs, CRFs, and forward-backward,
neural architectures for sequence labeling (Collobert et al., 2011; Lample et al., 2016)
||A3 [code] (Due 3/10)
|Week 6 (February 26)
||Structured Prediction in NLP: Constituency and Dependency Parsing
constituency parsing (Michael Collins' notes on PCFGs and the inside-outside algorithm; Kitaev and Klein, 2018),
transition-based dependency parsing (Nivre, 2008; Chen and Manning, 2014),
graph-based dependency parsing (Eisner, 1996; Kiperwasser and Goldberg, 2016)
|Week 7 (March 4)
||Unsupervised Learning in NLP: Latent-Variable Models and the EM Algorithm, Variational Autoencoders
|Week 8 (March 11)
|Week 9 (March 25)
||Proposal due 3/24
|Week 10 (April 1)
||Special Topics: TBD (Dialogue)
|Week 11 (April 8)
||Special Topics: TBD (Question Answering)
|Week 12 (April 15)
||Special Topics: TBD (Grounding)
||Milestone due 4/14
|Week 13 (April 22)
||Special Topics: TBD (Maximal Mutual Information Representation Learning)
|Week 14 (April 29)
||Project presentations
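As a taste of the course's starting point (Week 1, n-gram language modeling from Michael Collins' notes), a count-based bigram model with add-one smoothing can be sketched in a few lines of Python. This is an illustration only, not assignment code, and the toy corpus is made up:

```python
from collections import Counter

def train_bigram_lm(sentences):
    """Count unigrams and bigrams over sentences padded with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    vocab = set()
    for sent in sentences:
        toks = ["<s>"] + sent + ["</s>"]
        vocab.update(toks)
        unigrams.update(toks[:-1])                 # contexts (everything but </s>)
        bigrams.update(zip(toks[:-1], toks[1:]))   # adjacent token pairs
    return unigrams, bigrams, vocab

def bigram_prob(w_prev, w, unigrams, bigrams, vocab):
    """Add-one (Laplace) smoothed estimate of P(w | w_prev)."""
    return (bigrams[(w_prev, w)] + 1) / (unigrams[w_prev] + len(vocab))

corpus = [["the", "dog", "barks"], ["the", "cat", "sleeps"]]
u, b, V = train_bigram_lm(corpus)
p = bigram_prob("the", "dog", u, b, V)  # (count("the dog") + 1) / (count("the") + |V|)
```

By construction, the smoothed probabilities P(w | w_prev) sum to 1 over the vocabulary for any context, which is the defining property of a (bigram) language model.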
Optional reference texts:
- Speech and Language Processing (3rd edition) by Dan Jurafsky and James H. Martin
- A Primer on Neural Network Models for Natural Language Processing by Yoav Goldberg
- Natural Language Processing by Jacob Eisenstein