Lab 1, Beam Search Decoding

Aim

In this lab, you will have a better understanding of beam search decoding in NMT by implementing the codes.

Practicalities

The lab work can be performed individually or in pairs. It is not necessary to work in the same pairs during all assignments. Note that both students in the pair should be able to independently discuss your findings!

Data

Create a new directory for this assignment and copy all the files into this new work directory:

mkdir lab1/
cd lab1/
cp /local/kurs/mt/lab1/* .

Files description:

Set the working environment:
source ~/envNMT/bin/activate

Implementation

Compared to 1-best (greedy) decoding, beam search decoding generally gives better translations and is widely used in MT. Read the lecture on decoding and section 13.5.4 of Koehn13, complete the beam search decoding function in NMT. In the python file, the BeamNode class is used to represent each node in the beam search decoding. It has a set of values for beam search, such as the probability of the current hypothesis, the previous node, and the word of this node. Here are two additional videos for beam search from Andrew Ng: Video 1, Video 2.

Here is a summary of the beam search decoding process:

  1. start from the symbol,
  2. one step computation in the decoder, get the topk outputs, k=beam_size
  3. for each hypothesis/candidate, get the new topk outputs
  4. select topk candidates from the topk*beam_size candidates
  5. if you got a symbol, you get one completed translation, otherwise redo step 3 and 4
  6. stop the decoding when you get beam_size translations or the length of hypothesis is longer than MAX_LENGTH
  7. length normalization over the generated translations
  8. based on the last nodes with the highest score in the decoding paths, back trace all the preceeding nodes and get the final translation

You need to implement the code of the beam_decoding() function. In addtion, you also need to write the comments for important codes.

If implementing the code is too hard for you. You have the second choice: write the pseudo-code and draw the decoding steps of an example. You can choose the example by yourself, but the beam size is 3 and the three target predictions' length is 3, 4, and 5, resepectively. You need to draw the process from step 1 to the end, and write a short decription for each step. You can draw the steps on a paper, take a photo and put it into the report.

Discussion

Reporting

You do not need to finish this lab in the lab session. In addition to the python file with the implemented code, each group also needs to submit a written report. You should write some thoughts about your implementation and also answer the questions in the discussion section. If you work in groups, you can submit the same python file, the report should be a little bit different. (Both of you need to submit the code and report, and please specify your group as well.) If you choose not implement the code, you also need submit the pseudo-code and the illustration of beam search decoding steps.

You should hand in these files in studentportalen. The deadline for this lab is October 14.