GSLT: Machine Learning
The purpose of this course is to give a solid methodological
foundation in machine learning for language technology,
an overview of the most widely used approaches to learning,
and an in-depth understanding of a subset of these approaches.
The course is aimed at students with a basic knowledge of
natural language processing and/or speech technology (at least
the equivalent of a GSLT level 1 course in one of these areas,
see NLP, Speech technology). Basic programming skills are useful, as is
a rudimentary knowledge of statistics and probability theory.
The course consists of three parts:
- The first part of the course (taught in the first intensive week)
gives a general introduction to machine learning, covering basic
methodological principles and introducing the major learning paradigms
used in language technology.
- The second part of the course (taught in the second intensive
week) consists of two advanced tutorials on specific learning methods:
- Generalized linear classifiers
- Memory-based learning
Each tutorial will include a lab session with an assignment to be completed
after the session.
- The third part of the course is a practical project (reported at the
closing seminar) applying one or more of the learning methods covered in the
course to one or more areas of language technology.
The assignments for parts one and two should be carried out in groups of
two students. The final project may also be carried out in groups of two students.
The main textbook for the course is Mitchell (1997), Machine Learning,
which will be used especially in the first part of the course.
Students are advised to read at least chapters 1 and 5 before
the first intensive week.
NB:
The official language within GSLT is English but we can decide to have
lectures, seminars and discussions in Swedish instead,
provided of course that all participants
are comfortable with this. In any case, participants are free to formulate
their contributions to discussions, whether oral or written, in any language
that can be understood by the other participants (which in most
circumstances means Swedish or English).
Schedule
Lectures/Tutorials
Study Periods
- The period between the first and second intensive weeks will be devoted to one assignment:
- Read the textbook and do the written assignment.
Deadline: 18 October
- The first period after the second intensive week will be devoted to three tasks:
- Complete the GLC assignment and write a short report.
- Complete the MBL assignment and write a short report.
- Write a proposal for the final project (maximum 1 page).
Deadline: 22 November
- The second period after the second intensive week will be devoted to the course project, which
is to be reported in a term paper.
Deadline: 20 December
Closing Seminar
The closing seminar will take place at Uppsala University
on 14-15 January 2010.