In this assignment you will gain a better understanding of neural language modeling by completing code, inspecting generated text, and exploring the ability of neural language models to generate text.
The assignment can be performed individually or in pairs. It is not necessary to work in the same pairs during all assignments. Note that both students in a pair should be able to discuss your findings independently! Take notes while performing the assignment. During the examination session, students will get a chance to talk about their experiments: what they did, what surprised them, what they learned, etc.
This assignment is examined in class, on September 17, 15:00 (sharp)-16:00. Before this, you have to solve the tasks described in this document on your own. You can get help through the discussion forum in Studentportalen. Note that you have to attend the full examination session and be active in order to pass the assignment. If you miss this session, you will instead have to compensate for it; see the bottom of this page.
cp -rf /local/kurs/mt/assignment3/* .
Since most of you work remotely, you can check the available computers from here. If, say, "prefix" in Chomsky is free, you can log in via ssh
ssh -Y email@example.com
python3 -m venv ~/envNMT
source ~/envNMT/bin/activate
pip install -U pip
pip install numpy
pip install torch==1.4.0+cpu torchvision==0.5.0+cpu -f https://download.pytorch.org/whl/torch_stable.html
If you want to exit the environment, just run deactivate.
The Python files are for language modeling, and some code has been removed from
model.py. In this task, you need to complete the "TODO" code in the Python files. When you have completed the code correctly, you should not get any exceptions when you run
python nlm.py --dry-run
We are using PyTorch; please refer to the official docs.
forward() function in model.py
The forward() function performs the forward computation in the neural network. Given the input and the hidden states, it returns the softmax results and the new hidden states. Hint: the information flows rnn -> dropout -> output layer -> softmax.
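As a sketch of what a completed forward() might look like, here is a minimal model following the hinted information flow. The class name, attribute names, and sizes are illustrative assumptions, not necessarily those used in model.py; log-softmax is used here because it pairs with NLLLoss in training.

```python
import torch
import torch.nn as nn

class RNNModel(nn.Module):
    """Illustrative sketch of an RNN language model (names are assumptions,
    not necessarily those in the assignment's model.py)."""
    def __init__(self, vocab_size, emb_size=64, hidden_size=128):
        super().__init__()
        self.encoder = nn.Embedding(vocab_size, emb_size)   # word ids -> vectors
        self.rnn = nn.RNN(emb_size, hidden_size)
        self.drop = nn.Dropout(0.2)
        self.decoder = nn.Linear(hidden_size, vocab_size)   # output layer

    def forward(self, input, hidden):
        # information flow: rnn -> dropout -> output layer -> softmax
        emb = self.encoder(input)                # (seq_len, batch, emb_size)
        output, hidden = self.rnn(emb, hidden)   # (seq_len, batch, hidden_size)
        output = self.drop(output)
        decoded = self.decoder(output)           # (seq_len, batch, vocab_size)
        return torch.log_softmax(decoded, dim=-1), hidden
```

Given an input of shape (seq_len, batch) and an initial hidden state of shape (1, batch, hidden_size), this returns per-position log-probabilities over the vocabulary plus the updated hidden state.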
train() function in nlm.py
The train() function contains all the steps of training a neural language model. We train models in mini-batch style, i.e., we update the model after every batch. What you need to do is write the code inside the iteration. Given the data, we need to compute the gradients to update the model. There are three general steps in training a neural network: 1) forward computation; 2) compute the loss; 3) backpropagation -> gradients -> update the model (parameters).
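The three steps above could be sketched as a single mini-batch update like the following. The function and argument names are my own, assuming the model returns log-probabilities (so the criterion would be NLLLoss); the assignment's actual code may organize this differently.

```python
import torch
import torch.nn as nn

def train_one_batch(model, data, targets, hidden, optimizer, criterion):
    """One mini-batch update, following the three general training steps.
    Names are illustrative, not the assignment's own."""
    model.train()
    optimizer.zero_grad()                 # clear gradients from the last batch
    # 1) forward computation (detach the hidden state to truncate
    #    backpropagation through time at the batch boundary)
    output, hidden = model(data, hidden.detach())
    # 2) compute the loss over all positions in the batch
    loss = criterion(output.view(-1, output.size(-1)), targets.view(-1))
    # 3) backpropagation -> gradients -> update the parameters
    loss.backward()
    optimizer.step()
    return loss.item(), hidden
```

Calling this once per batch inside the iteration, and carrying the returned hidden state over to the next batch, gives the standard truncated-BPTT training loop.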
In this task, you will use some trained language models to generate text. Check the generated text and discuss your impressions of it. What surprised you, and what do you think causes the errors?
Here you will use a language model trained with the code from the first task to generate a text. Look at the training data and settings for the language model, and give your hypotheses on what causes the errors/bugs. Then think about how to improve the language model.
You can run
python generate.py --seed 100
to generate text. The generated text (
generated.txt) is in the current directory. Sentences are separated by < eos >. Note that if you replace "100" with a different number, you will get a different text, because the seed initializes the random number generator that drives the generation.
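For intuition about how the seed controls the output, a sampling loop of the kind a generation script typically implements might look like this. All names are illustrative assumptions; the assignment's generate.py may differ in its details.

```python
import torch

def sample_text(model, hidden, start_id, eos_id, max_len=50, seed=100):
    """Illustrative word-by-word sampling loop: feed each sampled word
    back in as the next input. Names are assumptions, not generate.py's."""
    torch.manual_seed(seed)   # same seed -> same random draws -> same text
    model.eval()
    input = torch.tensor([[start_id]])
    words = []
    with torch.no_grad():
        for _ in range(max_len):
            log_probs, hidden = model(input, hidden)
            # sample the next word from the model's distribution
            word_id = torch.multinomial(log_probs.exp().view(-1), 1).item()
            if word_id == eos_id:   # stop at the sentence separator
                break
            words.append(word_id)
            input = torch.tensor([[word_id]])
    return words
```

Because torch.manual_seed fixes the random draws, running with the same seed reproduces the same text, and a different seed yields a different sample.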
Currently, pre-trained language models are a very popular topic, and more and more of them have been proposed, such as ELMo, BERT, GPT-2, XLNet, RoBERTa, etc. You can find more language models and the relations between them from here. These models are generally based on RNNs or Transformers. (The Transformer is a more advanced model than RNNs in NLP; we will learn about it later.) Choose one of the models and figure out how it works by reading the paper or blog posts from the web.
Here we explore the LMs' ability to generate text. Write With Transformer provides some LMs for generating sentences. Test the following models and identify at least 2 main weaknesses (bugs) of these models.
(domain: NLP papers)
Towards the end of the assignment session, everyone will be expected to share their findings and discuss the assignment with the full class. You should all be prepared to report your main findings, discuss the questions asked, and bring up any other interesting issues that came up during the assignment.
If you fail to attend the oral session, you instead have to do the assignment on your own and report it afterwards. You can then either discuss the assignment during one of the project supervision sessions (given that the teacher has time), or write a report where you discuss your findings (around 1-2 A4 pages). The deadline for the compensation is October 23.