CS 533: Natural Language Processing (Spring 2021)


Instructor: Karl Stratos (karl.stratos@rutgers.edu)
TA: TBD
Instructor office hours: Tuesday 4-5pm, or by appointment
TA office hours: TBD


Course format. The course will be asynchronous and remote. Instead of fixed lecture times, there will be ongoing online interaction on Canvas along with weekly virtual office hours. The course will nonetheless be highly structured. Here is what to expect:
  1. One or more video lectures will be uploaded to Canvas every week. An advantage of this format is that you can watch the lectures at your convenience.
  2. Along with lectures, there will be optional reading from publicly available online textbooks and other resources (the slides are self-contained).
  3. The course will be driven mainly by written/programming assignments based on the material covered in the lectures. You can ask questions about lectures and assignments anytime on Canvas and in the virtual office hours. Any question posted on Canvas will be answered within two days (usually sooner). To guarantee fast turnaround and to involve other students, please use the Canvas discussion board; reserve other means of communication, such as email, for private matters.
  4. In addition to assignments, there will be an entrance quiz, a series of short timed quizzes, and a project.
Entrance quiz. This is a technical course. On the first day of class, at a designated time, we will have a timed entrance quiz on Canvas to assess whether you have a suitable background. The entrance quiz will count for 5% of the final grade.

Project. The last month of the course will be used for projects. The lectures will present selected conference papers to illustrate the current research landscape of the field. You will select a recent paper (subject to instructor approval), replicate and possibly build on its results, and submit a short video presentation and a final report of your findings.

Academic integrity policy.
  1. Assignments: collaboration is allowed and encouraged, as long as you (1) write up your solution entirely on your own and (2) list the students you collaborated with in your writeup. If you find a solution online, clearly acknowledge the source and still write your own solution from scratch. Copying solutions from others or from the internet is strictly prohibited.
  2. Quizzes: cheating is strictly prohibited.
  3. Project: teams of up to 3 students are allowed.
Any student caught cheating or plagiarizing will be reported to the Office of Student Conduct and will receive zero points for the assignment/quiz/exam in question, which will result in a low final grade.


Course description. This graduate-level course will cover technical foundations of modern natural language processing (NLP). The course will cast NLP as an application of machine learning, in particular deep learning, and focus on deriving general mathematical principles that underlie state-of-the-art NLP systems today.

Goals.
  1. Understanding the goals, capabilities, and principles of NLP
  2. Acquiring mathematical tools to formalize NLP problems
  3. Acquiring implementation skills to build practical NLP systems
  4. Obtaining an ability to critically read and accurately evaluate conference papers in NLP

Audience and prerequisites. No previous exposure to NLP is assumed. However, the course will be most beneficial for students with some programming experience and familiarity with basic concepts in probability and statistics, calculus, and linear algebra. Examples of such concepts include
  • Random variables (continuous or discrete), expectation, mean/variance
  • Matrix and vector operations
  • Derivatives, partial derivatives, gradients
  • Programming (in Python): familiarity with data structures and algorithms
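As an illustrative self-check (this sketch is not part of the official course materials), you should be comfortable reading and writing short numerical Python like the following, which touches vector operations and basic probability by computing a softmax distribution, a concept that appears early in the course:

```python
import math

def softmax(scores):
    """Map a list of real-valued scores to a probability distribution.

    Subtracting the maximum score before exponentiating is a standard
    trick for numerical stability; it does not change the result.
    """
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
# probs is nonnegative, sums to 1, and the largest score
# receives the largest probability
```

If every line here is readable to you, you likely have the expected level of Python and math fluency.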
If you are an undergraduate, prerequisites are as follows:
  • Required: M250 (linear algebra), 112 (data structures), 206 (discrete II)
  • Recommended: M251 (multivariable calculus), 533 (machine learning)
  • Alternatives to 206: M477 (probability), S379 (basic probability theory), or instructor's permission
The entrance quiz will be useful for evaluating whether you meet the prerequisites.

Grading.
  1. Assignments: 50%
  2. Entrance quiz: 5%
  3. Quizzes (excluding the entrance quiz): 15%
  4. Project: 30%
Both the assignment reports and the project report must be written in LaTeX using the provided templates. If you have never used LaTeX before, you can pick it up quickly (tutorial, style guide). The project is due at the end of the semester and will be graded as follows:
  • 10-minute presentation: 1/3 of the project grade. The presentation should strictly adhere to the paper-writing tips by Jennifer Widom and clearly explain (1) what the problem is, (2) why it is interesting and important, (3) why it is hard, (4) why previous approaches fail, and (5) what the key components of your approach and results are.
  • 4-page report (excluding references): 2/3 of the project grade. The report should flesh out the presentation with details, like a conference paper.


Tentative plan.
Date Topics
Week 1 (Jan 18) General introduction and review of prerequisites
Week 2 (Jan 25) Linear classification: score function, softmax, loss function, stochastic gradient descent, regularization
Week 3 (Feb 1) Deep learning: nonlinearity and universality of neural networks, backpropagation
Week 4 (Feb 8) Sequence-to-sequence models: chain rule, perplexity, training and beam search
Week 5 (Feb 15) Neural architectures for sequences: convolutional, recurrent, and transformer
Week 6 (Feb 22) Structured prediction: Markov random fields, conditional random fields, Viterbi, CKY
Week 7 (Mar 1) Self-supervised representation learning: word embeddings, contextual word embeddings, masked language modeling
Week 8 (Mar 8) Unsupervised learning: latent-variable models, expectation maximization, variational autoencoders
Week 9 (Mar 22) Information extraction: knowledge base, entity linking, relation extraction, question answering
Week 10 (Mar 29) Text generation: translation, summarization, image captioning, data-to-text generation, dialogue
Week 11 (Apr 5) Project phase: selected conference paper
Week 12 (Apr 12) Project phase: selected conference paper
Week 13 (Apr 19) Project phase: selected conference paper
Week 14 (Apr 26) Project phase: selected conference paper


Online resources.
  1. Natural Language Processing by Jacob Eisenstein
  2. A Primer on Neural Network Models for Natural Language Processing by Yoav Goldberg