Ali Basirat



R&D in Language Technology - Word Embeddings (2020)

Reading List

  1. Deerwester, S., Dumais, S.T., Furnas, G.W., Landauer, T.K., and Harshman, R., (1990), Indexing by latent semantic analysis, Journal of the American Society for Information Science, 41(6), 391--407
  2. Lund, K., and Burgess, C., (1996), Producing high-dimensional semantic spaces from lexical co-occurrence, Behavior Research Methods, Instruments, & Computers, 28(2), 203--208
  3. Bengio, Y., Ducharme, R., and Vincent, P., (2001), A Neural Probabilistic Language Model, Advances in Neural Information Processing Systems, 13, 932--938
  4. Padó, S., and Lapata, M., (2007), Dependency-Based Construction of Semantic Space Models, Computational Linguistics, 33(2)
  5. Sellberg, L., and Jönsson, A., (2008), Using Random Indexing to improve Singular Value Decomposition for Latent Semantic Analysis, Proceedings of the Sixth International Conference on Language Resources and Evaluation
  6. Mikolov, T., Chen, K., Corrado, G., and Dean, J., (2013), Efficient Estimation of Word Representations in Vector Space, 1st International Conference on Learning Representations (ICLR), [Sem. 1]
  7. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., and Dean, J., (2013), Distributed Representations of Words and Phrases and their Compositionality, Advances in Neural Information Processing Systems, 26, 3111--3119
  8. Mikolov, T., Yih, W., and Zweig, G., (2013), Linguistic Regularities in Continuous Space Word Representations, Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 746--751
  9. Mnih, A., and Kavukcuoglu, K., (2013), Learning word embeddings efficiently with noise-contrastive estimation, Advances in Neural Information Processing Systems, 26
  10. Lebret, R., and Collobert, R., (2014), Word Embeddings through Hellinger PCA, Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics
  11. Pennington, J., Socher, R., and Manning, C., (2014), GloVe: Global Vectors for Word Representation, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), [Sem. 1]
  12. Neelakantan, A., Shankar, J., Passos, A., McCallum, A., (2014), Efficient Non-parametric Estimation of Multiple Embeddings per Word in Vector Space, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing
  13. Levy, O., and Goldberg, Y., (2014), Dependency-Based Word Embeddings, Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics
  14. Schnabel, T., Labutov, I., Mimno, D., and Joachims, T., (2015), Evaluation methods for unsupervised word embeddings, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 298--307
  15. Vilnis, L., and McCallum, A., (2015), Word Representations via Gaussian Embedding, 3rd International Conference on Learning Representations (ICLR), [Sem. 2]
  16. Arora, S., Li, Y., Liang, Y., Ma, T., and Risteski, A., (2016), A Latent Variable Model Approach to PMI-based Word Embeddings, Transactions of the Association for Computational Linguistics, 4, 385--399
  17. Melamud, O., Goldberger, J., and Dagan, I., (2016), Context2vec: Learning Generic Context Embedding with Bidirectional LSTM, Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning, 51--61, [Sem. 3]
  18. Bojanowski, P., Grave, E., Joulin, A., and Mikolov, T., (2017), Enriching Word Vectors with Subword Information, Transactions of the Association for Computational Linguistics, 5, 135--146, [Sem. 1]
  19. Peters, M., Ammar, W., Bhagavatula, C., and Power, R., (2017), Semi-supervised sequence tagging with bidirectional language models, Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 1, 1756--1765
  20. Nguyen, D.Q., Nguyen, D.Q., Modi, A., Thater, S., and Pinkal, M., (2017), A Mixture Model for Learning Multi-Sense Word Embeddings, Proceedings of the 6th Joint Conference on Lexical and Computational Semantics, 121--127, [Sem. 2]
  21. McCann, B., Bradbury, J., Xiong, C., and Socher, R., (2017), Learned in Translation: Contextualized Word Vectors, Advances in Neural Information Processing Systems 30 (NIPS 2017), [Sem. 3]
  22. Peters, M., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., and Zettlemoyer, L., (2018), Deep Contextualized Word Representations, Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 1, 2227--2237, [Sem. 3]
  23. Bražinskas, A., Havrylov, S., and Titov, I., (2018), Embedding Words as Distributions with a Bayesian Skip-gram Model, Proceedings of the 27th International Conference on Computational Linguistics, 1775--1789, [Sem. 2]
  24. Devlin, J., Chang, M., Lee, K., and Toutanova, K., (2019), BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 1, 4171--4186

Background

  1. Geoffrey E. Hinton, James L. McClelland, and David E. Rumelhart, (1986), Distributed Representations, Parallel Distributed Processing: Explorations in the Microstructure of Cognition: Foundations, Vol. 1
  2. David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams, (1986), Learning Internal Representations by Error Propagation, Parallel Distributed Processing: Explorations in the Microstructure of Cognition: Foundations, Vol. 1
  3. Jeffrey L. Elman, (1990), Finding structure in time, Cognitive Science, 14(2), 179--211
  4. Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, (2008), Introduction to Information Retrieval, Ch. 18: Matrix decompositions & latent semantic indexing, Cambridge University Press
  5. Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio, (2015), Neural Machine Translation by Jointly Learning to Align and Translate, In 3rd International Conference on Learning Representations (ICLR 2015)
  6. Thang Luong, Hieu Pham, and Christopher D. Manning, (2015), Effective Approaches to Attention-based Neural Machine Translation, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
  7. Jozefowicz, R., Vinyals, O., Schuster, M., Shazeer, N., and Wu, Y., (2016), Exploring the Limits of Language Modeling, arXiv preprint arXiv:1602.02410
  8. Ankur Parikh, Oscar Täckström, Dipanjan Das, Jakob Uszkoreit (2016), A Decomposable Attention Model for Natural Language Inference, Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
  9. Zhouhan Lin, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou, Yoshua Bengio (2017), A Structured Self-attentive Sentence Embedding, In 5th International Conference on Learning Representations (ICLR 2017)
  10. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I., (2017), Attention is All you Need, Advances in Neural Information Processing Systems 30 (NIPS 2017)
  11. Peter Shaw, Jakob Uszkoreit, Ashish Vaswani (2018), Self-Attention with Relative Position Representations, Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
  12. Dan Jurafsky and James H. Martin, (2019), Speech and Language Processing, 3rd Edition

Further Reading

  1. Maximilian Nickel and Douwe Kiela, (2017), Poincaré Embeddings for Learning Hierarchical Representations, Advances in Neural Information Processing Systems 30 (NIPS 2017)
  2. Kazuma Hashimoto, Caiming Xiong, Yoshimasa Tsuruoka, Richard Socher (2017), A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks, Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
  3. Robert Bamler and Stephan Mandt, (2017), Dynamic Word Embeddings, Proceedings of the 34th International Conference on Machine Learning (ICML 2017)
  4. Sean MacAvaney, Amir Zeldes, (2018), A Deeper Look into Dependency-Based Word Embeddings, Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop
  5. Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., and Sutskever, I., (2019), Language Models are Unsupervised Multitask Learners, OpenAI Blog, 1(8)
  6. Peters, M., Neumann, M., Zettlemoyer, L., and Yih, W., (2018), Dissecting Contextual Word Embeddings: Architecture and Representation, Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 1499--1509
  7. Ali Basirat and Marc Tang, (2019), Linguistic Information in Word Embeddings, In: van den Herik, J., and Rocha, A. (eds), Agents and Artificial Intelligence, ICAART 2018, Lecture Notes in Computer Science, vol. 11352, pp. 492--513, Springer, Cham
  8. Peter De Bolla, Ewan Jones, Paul Nulty, Gabriel Recchia, and John Regan, (2019), Distributional Concept Analysis, Contributions to the History of Concepts
  9. Ben Athiwaratkun and Andrew Gordon Wilson, (2017), Multimodal Word Distributions, Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
  10. Thompson, B., Roberts, S.G., and Lupyan, G., (2020), Cultural influences on word meanings revealed through large-scale semantic alignment, Nature Human Behaviour, 4
  11. Mario Giulianelli, Marco Del Tredici, Raquel Fernández, (2020), Analysing Lexical Semantic Change with Contextualised Word Representations, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL-2020)
  12. Lewis, M., and Lupyan, G., (2020), Gender stereotypes are reflected in the distributional structure of 25 languages, Nature Human Behaviour, 4

Master Theses

  1. Guanchen Song, (2019), Multilingual word embeddings based on cross-lingual annotations, Master's thesis, Uppsala University
  2. Andrew Dyer, (2019), Low supervision, low corpus size, low similarity! Challenges in cross-lingual alignment of word embeddings, Master's thesis, Uppsala University
  3. Mario Giulianelli, (2019), Analysing Lexical Semantic Change with Contextualised Word Representations, Master's thesis, University of Amsterdam
  4. Adam Moss, (2020), Detecting Lexical Semantic Change Using Probabilistic Gaussian Word Embeddings, Master's thesis, Uppsala University