Natural Language Processing 1 (2015)


Lecturer: Ivan Titov (titov at uva.nl)
Assistants: Sophie Arnoult, Joachim Daiber, Ehsan Khoddam Mohammadi ({s.arnoult | Daiber.Joachim | E.khoddammohammadi}) at gmail.com
All assistants: nlp.uva.2015 at gmail.com

Short Description

This class is an introduction to statistical natural language processing (NLP) for graduate students. The goal is to introduce students to the key challenges and foundational methods of NLP. Specifically, we will study syntactic parsing for constituent and dependency representations, and look into shallow semantic representations (semantic role labeling), topic models and distributional semantics methods. Several lectures will cover important NLP applications such as statistical machine translation and summarization. We will also cover some machine learning background crucial in modern NLP (specifically, discriminative and generative models of structures, latent variable models and, time permitting, Bayesian modeling methods and representation learning techniques).

Blackboard will be used for semi-urgent, up-to-date information. The only exceptions are lecture slides and reading recommendations: these will be posted here.

Reading

We will use Jurafsky and Martin's "Speech and Language Processing" (Edition 2) as the main textbook. Sections / chapters related to specific lectures are listed below. Optionally, I would also suggest consulting Manning and Schütze's "Foundations of Statistical Natural Language Processing". Note, however, that much of the material presented in the lectures is not available in either book.


Grading

Grading components:


Project


Assignments

There will be four (non-programming) assignments. They will be posted on Blackboard in due time, and the submission procedures will be described there as well.

Lectures

Oct 26 Introduction to NLP, project discussion
Oct 29 Topic models (start) Not in the textbook; suggested extra reading: PLSA, LDA, Gibbs sampling for LDA
Nov 2 Topic models
Nov 5 Applications / generalizations of topic models, Hidden Markov Models, decoding algorithm (Viterbi; a small Viterbi sketch is included after the schedule below) Reading: J&M 5.1-5.5; 6.1-6.4
Nov 9 Hidden Markov Models: discriminative estimation (structured perceptron), unsupervised estimation (forward-backward) Reading: from Nov 5 plus J&M 12.1-4, 13.1-4, 14.1-7;
Nov 12 Hidden Markov Models: discriminative estimation (CRF), neural sequence models (RNNs / encoder-decoder);
Nov 16 Syntactic (constituent) parsing Reading: J&M 12.1-4, 13.1-4, 14.1-7;
Nov 23 Syntactic (constituent) parsing (continued), dependency syntax Reading: J&M 12.1-4, 13.1-4, 14.1-7;
Nov 26 Syntactic dependency parsing (animation of transition-based parsing)
Nov 30 Distributional semantics (a small co-occurrence / cosine-similarity sketch is included at the end of this page), preliminary set of topics for the exam
Dec 4 Machine translation (slides from previous year) Reading: J&M 25;
Dec 7 Machine translation (no slides for the moment, not part of the exam)
Dec 10 Towards machine reading and reasoning
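To make the HMM decoding lectures (Nov 5 / Nov 9) a bit more concrete, here is a minimal Viterbi sketch in Python. This is only an illustrative sketch, not course code: the two-state tag set, the probability tables and the toy sentence are all invented for the example.

import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    # Dynamic programming in log-space: V[t][s] is the score of the best
    # state sequence ending in state s at position t.
    V = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            best_prev, best_score = max(
                ((p, V[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda x: x[1])
            V[t][s] = best_score + math.log(emit_p[s][obs[t]])
            back[t][s] = best_prev
    # Backtrace from the best final state.
    last = max(V[-1], key=V[-1].get)
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path))

# Toy example: two tag-like states and a three-word "sentence";
# all probabilities are made up for illustration.
states = ["N", "V"]
start_p = {"N": 0.7, "V": 0.3}
trans_p = {"N": {"N": 0.4, "V": 0.6}, "V": {"N": 0.8, "V": 0.2}}
emit_p = {"N": {"dogs": 0.3, "fish": 0.5, "sleep": 0.2},
          "V": {"dogs": 0.1, "fish": 0.3, "sleep": 0.6}}
print(viterbi(["dogs", "fish", "sleep"], states, start_p, trans_p, emit_p))

The same dynamic program, with max replaced by summation, gives the forward algorithm used in the forward-backward (unsupervised) estimation discussed on Nov 9.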

Lecture slides will be made downloadable (after each lecture).
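For the distributional semantics lecture (Nov 30), the following small Python sketch illustrates the count-and-compare idea: build word-by-context co-occurrence vectors from a corpus and compare words with cosine similarity. The toy corpus, the window size and the helper function are made up for illustration and are not part of the course materials.

from collections import Counter, defaultdict
import math

corpus = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "the cat chases the dog".split(),
]

# Count context words within a symmetric window around each target word.
window = 2
vectors = defaultdict(Counter)
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if j != i:
                vectors[w][sent[j]] += 1

def cosine(u, v):
    # Cosine similarity between two sparse count vectors (dicts).
    shared = set(u) & set(v)
    dot = sum(u[k] * v[k] for k in shared)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

print(cosine(vectors["cat"], vectors["dog"]))   # similar contexts -> higher score
print(cosine(vectors["cat"], vectors["milk"]))  # less context overlap -> lower score

Real distributional models use much larger corpora and reweight the raw counts (e.g. with PMI) or learn dense embeddings, but the basic idea of comparing words by their contexts is the same.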