# Statistical Methods for NLP

Credits: 7,5 hp

Syllabus: 5LN704

Teachers: Joakim Nivre, Evelina Andersson (teaching assistant)

## News

- Due to technical problems, I was not able to broadcast the third lecture in Adobe Connect. As a substitute, I have uploaded the recordings of the same lecture from last year's course. (2012-02-13)
- The lecture originally scheduled on January 30 has been moved to February 2. (2012-01-10)

## Schedule

| # | Date | Time | Room | Content | Reading |
|---|------|------|------|---------|---------|
| 1 | 23/1 | 13-15 | 9-2029 | Probability theory (Slides, Recording1, Recording2) | Schay, ch. 1, 3, 4 (not 4.2-4.3) |
| 2 | 2/2 | 13-15 | 9-2029 | Statistical inference (Slides, Recording1, Recording2) | Schay, ch. 5 (not 5.3-5.6), 7 |
| 3 | 13/2 | 15-17 | 9-2029 | Bayesian classification (Slides, Recording1, Recording2) | Mitchell, 1-2.3; Jurafsky & Martin, 20.1-20.2; Androutsopoulos et al. |
| 4 | 16/2 | 13-15 | 9-2029 | Hidden variables and EM (Slides, Recording1, Recording2) | Prescher; Nigam et al. |
| 5 | 27/2 | 13-15 | 9-2029 | Sequence models (Slides, Recording1, Recording2) | Jurafsky & Martin, 5.5, 6.1-6.5 |
| 6 | 12/3 | 13-15 | 9-2029 | Stochastic grammars (Slides, Recording1, Recording2) | Jurafsky & Martin, 14.1-14.6; Prescher |

All lectures will be broadcast through SUNET's Adobe Connect server. Connect through:

Flash Player 8.0.0.0 or above is required, and you will be prompted to allow an add-in to be installed.

## Intended Learning Outcomes

In order to pass the course, a student must be able to

- apply basic probability theory and principles of statistical inference to natural language data,
- implement simple statistical models for classification and sequence labeling in language technology,
- construct treebank grammars for use in natural language parsing,
- apply the principles of expectation-maximization to models with hidden variables.

## Examination and Grading Criteria

The course is examined by means of four assignments. In order to pass the course, a student must pass each of these. In order to pass the course with distinction (Väl godkänt), a student must pass at least two assignments with distinction.

## Reading List

- Androutsopoulos, I., Koutsias, J., Chandrinos, K. V., Paliouras, G. and Spyropoulos, C. D. (2000) An Evaluation of Naive Bayesian Anti-Spam Filtering. In *Proceedings of the Workshop on Machine Learning in the New Information Age*, 11th European Conference on Machine Learning (ECML 2000), Barcelona, Spain, pp. 9-17.
- Jurafsky, Daniel and Martin, James H. (2008) *Speech and Language Processing*. Second Edition. Prentice Hall.
- Mitchell, Tom (2005) Generative and Discriminative Classifiers: Naive Bayes and Logistic Regression. Supplementary chapter for Mitchell, Tom (1997) *Machine Learning*. McGraw-Hill.
- Nigam, K., McCallum, A., Thrun, S. and Mitchell, T. (2000) Text Classification from Labeled and Unlabeled Documents using EM. *Machine Learning*, 39, 103-134.
- Prescher, Detlef (2003) A Tutorial on the Expectation-Maximization Algorithm Including Maximum-Likelihood Estimation and EM Training of Probabilistic Context-Free Grammars. Presented at the 15th European Summer School in Logic, Language, and Information (ESSLLI 2003).
- Schay, Géza (2007) *Introduction to Probability with Statistical Applications*. Birkhäuser.

**NB:** Schay (2007) is my suggestion for those who do not already have a book on probability theory, but any introductory textbook on the topic will do fine.

## Course Evaluation

Course evaluation questionnaire