In this lab, you will gain a better understanding of attention mechanisms in NMT by implementing them yourself. This lab is an extension of Assignment 4, where you implemented a seq2seq model without an attention mechanism.
First, copy the lab files to your working directory and activate the course conda environment:

cp -rf /local/kurs/mt/lab/* .
conda activate mt21
Compared to the seq2seq model without an attention mechanism in Assignment 4, the main difference is the AttnDecoderRNN class in the file, which adds an attention layer to the decoder. The code already provides the AttentionDot class, which uses the dot product to compute the attention scores.
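For orientation, here is a minimal sketch of how dot-product attention can be computed in PyTorch. The function name, tensor names, and shapes (a decoder state of shape (batch, hidden_size) attending over encoder outputs of shape (batch, src_len, hidden_size)) are illustrative assumptions, not the exact code of the provided AttentionDot class:

import torch
import torch.nn.functional as F

def dot_attention(hidden, encoder_outputs):
    # hidden: (batch, hidden_size), the current decoder state
    # encoder_outputs: (batch, src_len, hidden_size)
    # dot-product score: score(h_t, h_s) = h_t . h_s
    scores = torch.bmm(encoder_outputs, hidden.unsqueeze(2)).squeeze(2)    # (batch, src_len)
    weights = F.softmax(scores, dim=1)                                     # attention distribution
    context = torch.bmm(weights.unsqueeze(1), encoder_outputs).squeeze(1)  # (batch, hidden_size)
    return context, weights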
You need to implement the TODO code (TODO 1-6) in the Python file. When you finish the implementation, you can test your code by running the script, e.g.:

python seq2seq_attention.py
In addition, you also need to write comments for the important lines of code, to show that you have understood the model and know the meaning of each line. PS: writing comments is always a good habit in programming; it makes the code easier to read for others and for yourself (as you might have forgotten the details when you come back to it later).
In this section, you need to implement the conventional attention mechanism with two other score computation methods, and multi-head attention with the scaled dot-product computation method. The illustrations and equations for these functions are given on Slides 10-13 of Lecture 7: advanced NMT.
You can follow the provided code in the AttentionDot class to implement these three classes, i.e., one class for each of the general, concat, and multi-head attention types.
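As a reference, here is a sketch of the three score computations, following the standard formulations from the lecture slides: the general score h_t^T W_a h_s, the concat score v_a^T tanh(W_a [h_t ; h_s]), and the scaled dot product q . k / sqrt(d_k) used inside multi-head attention. The layer names and shapes are assumptions for illustration; your implementations should follow the class interface of AttentionDot instead:

import math
import torch
import torch.nn as nn

hidden_size = 8  # illustrative size

# "general" score: score(h_t, h_s) = h_t^T W_a h_s
W_a = nn.Linear(hidden_size, hidden_size, bias=False)

def general_score(hidden, encoder_outputs):
    # hidden: (batch, hidden_size), encoder_outputs: (batch, src_len, hidden_size)
    return torch.bmm(encoder_outputs, W_a(hidden).unsqueeze(2)).squeeze(2)  # (batch, src_len)

# "concat" score: score(h_t, h_s) = v_a^T tanh(W_a [h_t ; h_s])
W_c = nn.Linear(2 * hidden_size, hidden_size, bias=False)
v_a = nn.Linear(hidden_size, 1, bias=False)

def concat_score(hidden, encoder_outputs):
    src_len = encoder_outputs.size(1)
    h = hidden.unsqueeze(1).expand(-1, src_len, -1)  # repeat the decoder state over source positions
    energy = torch.tanh(W_c(torch.cat((h, encoder_outputs), dim=2)))
    return v_a(energy).squeeze(2)                    # (batch, src_len)

# scaled dot-product score (used inside multi-head attention):
# score(q, k) = q . k / sqrt(d_k)
def scaled_dot_score(q, k):
    # q: (batch, d_k), k: (batch, src_len, d_k)
    return torch.bmm(k, q.unsqueeze(2)).squeeze(2) / math.sqrt(q.size(-1))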
You can test each class by passing the --attn_type parameter to the script; here are the specific commands for testing each class:
python seq2seq_attention.py --attn_type general
python seq2seq_attention.py --attn_type concat
python seq2seq_attention.py --attn_type multihead --attention_head 2 --initial_learning_rate 0.0001
You can use the nohup command to train the model in the background, i.e., you do not need to wait until it finishes. You can check the output in the log file. Here is an example:
nohup python seq2seq_attention.py --attn_type concat >log.train.concat 2>&1 &
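While the job is running, you can follow the training output, for example with:

tail -f log.train.concat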
You should hand in these files in Studium. The deadline for this lab is October 29.