In this lab, you will have a better understanding of beam search decoding in NMT by implementing the codes.

`mkdir lab1/ `

`cd lab1/ `

`cp /local/kurs/mt/lab1/* . `

Files description:

```
```

- beam_search.py: The python file for beam search decoding, you need to implement the beam search decoding function in this file.

```
```

`source ~/envNMT/bin/activate`

Compared to 1-best (greedy) decoding, beam search decoding generally gives better translations and is widely used in MT.
Read the lecture on decoding and section 13.5.4 of Koehn13, complete the beam search decoding function in NMT. In the python file, the `BeamNode`

class is used to represent each node in the beam search decoding. It has a set of values for beam search, such as the probability of the current hypothesis, the previous node, and the word of this node. Here are two additional videos for beam search from Andrew Ng: Video 1, Video 2.

Here is a summary of the beam search decoding process:

- start from the
symbol, - one step computation in the decoder, get the topk outputs, k=beam_size
- for each hypothesis/candidate, get the new topk outputs
- select topk candidates from the topk*beam_size candidates
- if you got a
symbol, you get one completed translation, otherwise redo step 3 and 4 - stop the decoding when you get beam_size translations or the length of hypothesis is longer than MAX_LENGTH
- length normalization over the generated translations
- based on the last nodes with the highest score in the decoding paths, back trace all the preceeding nodes and get the final translation

You need to implement the code of the `beam_decoding()`

function. In addtion, you also need to write the comments for important codes.

If implementing the code is too hard for you. You have the second choice: write the pseudo-code and draw the decoding steps of an example. You can choose the example by yourself, but the beam size is 3 and the three target predictions' length is 3, 4, and 5, resepectively. You need to draw the process from step 1 to the end, and write a short decription for each step. You can draw the steps on a paper, take a photo and put it into the report.

- Do we use beam search decoding during training? And why?
- Is beam search decoding algorithm guranteed to find the optimal translation? Why?
- Please describe the beam search decoding in SMT and write the differences with beam search decoding in NMT.

You should hand in these files in studentportalen. The deadline for this lab is October 14.