Ali Basirat



Information Retrieval (2020)

Resources

Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008.

Dan Jurafsky and James H. Martin, Speech and Language Processing (3rd ed. draft) - Ch. 18, 25

Examination

  1. Lab and assignment reports
  2. Seminar presentation on selected sections of the book
  3. Literature review
  4. Individual/group projects

Zoom invitations to the online sessions

Note: some meetings may need a password to join. The password is sent to the email addresses of registered students. If you have not received the password, please email the instructor.

  1. 30-March-2020: Meeting ID: 463-615-907, invitation link, detailed info

Topics and Slides
  1. Introduction and course outline

  2. Boolean Retrieval

  3. Scoring, Term Weighting & the Vector Space Model

  4. Evaluation in Information Retrieval

  5. Relevance Feedback and Query Expansion

  6. Probabilistic Information Retrieval

  7. Language Models for Information Retrieval

  8. Text Classification and Naive Bayes

  9. Matrix Decomposition and Latent Semantic Indexing

  10. Vector Space Classification

  11. Information Extraction

Deadlines

The initial submission deadlines for all assignments, lab reports, literature reviews, and project reports are given either on this page or in the corresponding instructions. The secondary submission deadline for all of these items is the 14th of August.

Homework, Assignments, and Exercises

Prepare a report for each series of exercises here and upload your report to the student portal by the 1st of June.

Seminars

Seminars extend selected sections of the books. They can be presented individually or in groups of at most three students. Normally, a seminar should not take more than 15 minutes; if more time is needed, this can be discussed. Group members can decide how to present. The topics listed below are those presented by former students. Note: we MAY cancel the seminars due to the coronavirus crisis.

Topic | Source | Group | Date
Variant Tf-idf Functions | Section 6.4 | |
Neural Networks in Information Retrieval | A short introduction | |
A broader perspective: system quality and user utility | Section 8.6 | |
Probabilistic IR: An appraisal and some extensions | Section 11.4 | |
Language modeling versus other approaches in IR - Extended language modeling approaches | Sections 12.3 and 12.4 | |
Feature selection in text classification | Section 13.5 | |

Labs
  1. Lab 1: Boolean and Ranked Retrieval - Submission deadline: 2020-04-15
  2. Lab 2: Test Collections - Submission deadline: 2020-04-27
  3. Lab 3: Evaluation - Submission deadline: 2020-05-13

Literature review

This is a group activity. Group registration should be done in the student portal by 2020-04-17 00:00 2020-04-24 00:01. Each group of three students should prepare a short summary (1-2 pages) for two of the three papers listed below. The summaries should be uploaded to the student portal before 2020-06-01 00:00 and should address the following items:

  1. the research questions of the paper
  2. the contributions of the paper
  3. the interesting points of the paper
  4. the unclear parts of the paper
  5. VG score: How can each of the two papers be updated with modern NLP methods?

Choose two of the following papers to review:

  1. Ellen M. Voorhees, Natural Language Processing for Information Retrieval, Information Extraction: Towards Scalable, Adaptable Systems, January 1999, pages 32–48
  2. Adam Berger and John Lafferty, Information retrieval as statistical translation, SIGIR '99: Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, August 1999, pages 222–229
  3. Stefan Riezler, Alexander Vasserman, Ioannis Tsochantaridis, Vibhu Mittal, Yi Liu, Statistical Machine Translation for Query Expansion in Answer Retrieval, Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, 2007

Projects

You need to register in one of the project groups, each consisting of at most three people. You can work on your own topic or choose one from our IR project list of topics. You are strongly encouraged to develop your own ideas. Project topics have to be confirmed by the course instructor. You can choose between two project deadlines: the 1st of June or the 14th of August. By one of these deadlines, you need to upload your project report to the student portal. The second deadline is mainly intended for those who failed at the first deadline and need to resubmit their report. However, you can also use the second deadline if you have not submitted a report earlier. In case of failure, the next opportunity for resubmission is the next time the course is given.