Lab 1, Beam Search Decoding
In this lab, you will have a better understanding of beam search decoding in NMT by implementing the codes.
The lab work can be performed individually or in pairs. It is not necessary to work in the same pairs during all assignments. Note that both students in the pair should be able to independently discuss your findings!
Create a new directory for this assignment and copy all the files into this new work directory:
cp /local/kurs/mt/lab1/* .
Set the working environment:
- beam_search.py: The python file for beam search decoding, you need to implement the beam search decoding function in this file.
Compared to 1-best (greedy) decoding, beam search decoding generally gives better translations and is widely used in MT.
Read the lecture on decoding and section 13.5.4 of Koehn13, complete the beam search decoding function in NMT. In the python file, the
BeamNode class is used to represent each node in the beam search decoding. It has a set of values for beam search, such as the probability of the current hypothesis, the previous node, and the word of this node. Here are two additional videos for beam search from Andrew Ng: Video 1, Video 2.
Here is a summary of the beam search decoding process:
- start from the symbol,
- one step computation in the decoder, get the topk outputs, k=beam_size
- for each hypothesis/candidate, get the new topk outputs
- select topk candidates from the topk*beam_size candidates
- if you got a symbol, you get one completed translation, otherwise redo step 3 and 4
- stop the decoding when you get beam_size translations or the length of hypothesis is longer than MAX_LENGTH
- length normalization over the generated translations
- based on the last nodes with the highest score in the decoding paths, back trace all the preceeding nodes and get the final translation
You need to implement the code of the
beam_decoding() function. In addtion, you also need to write the comments for important codes.
If implementing the code is too hard for you. You have the second choice: write the pseudo-code and draw the decoding steps of an example. You can choose the example by yourself, but the beam size is 3 and the three target predictions' length is 3, 4, and 5, resepectively. You need to draw the process from step 1 to the end, and write a short decription for each step. You can draw the steps on a paper, take a photo and put it into the report.
- Do we use beam search decoding during training? And why?
- Is beam search decoding algorithm guranteed to find the optimal translation? Why?
- Please describe the beam search decoding in SMT and write the differences with beam search decoding in NMT.
You do not need to finish this lab in the lab session. In addition to the python file with the implemented code, each group also needs to submit a written report. You should write some thoughts about your implementation and also answer the questions in the discussion section. If you work in groups, you can submit the same python file, the report should be a little bit different. (Both of you need to submit the code and report, and please specify your group as well.) If you choose not implement the code, you also need submit the pseudo-code and the illustration of beam search decoding steps.
You should hand in these files in studentportalen. The deadline for this lab is October 14.