Ali Basirat

Principal Word Vectors

Principal word vectors are a set of word vectors (embeddings) generated by applying principal component analysis (PCA) to a matrix of contextual word vectors. The contextual word vector of a word is a real-valued vector whose elements are the frequencies with which the word is seen in the different contexts formed in a corpus. A matrix of the contextual word vectors for a set of words of a language is called a contextual matrix.
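As a rough illustration of these definitions, the Python sketch below builds a small contextual matrix and applies ordinary PCA to it. It is not the released implementation. The toy corpus, the use of neighbouring words as contexts, the window size, the layout with one row per word, and the function names contextual_matrix and principal_word_vectors are all assumptions made for illustration. Plain PCA computed through a singular value decomposition stands in for whatever transformations the actual method applies.

```python
import numpy as np


def contextual_matrix(sentences, window=1):
    """Count how often each word is seen next to each context word.

    Assumption: a "context" is simply a neighbouring word within `window`
    positions; row i holds the contextual word vector of vocab[i].
    """
    vocab = sorted({w for sent in sentences for w in sent})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)), dtype=float)
    for sent in sentences:
        for pos, word in enumerate(sent):
            lo = max(0, pos - window)
            hi = min(len(sent), pos + window + 1)
            for ctx_pos in range(lo, hi):
                if ctx_pos != pos:
                    counts[index[word], index[sent[ctx_pos]]] += 1.0
    return vocab, counts


def principal_word_vectors(counts, k):
    """Project the mean-centred rows onto the top-k principal directions."""
    centred = counts - counts.mean(axis=0, keepdims=True)
    # SVD of the centred matrix: u * s gives the principal component scores.
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    return u[:, :k] * s[:k]


# Hypothetical toy corpus, used only to exercise the functions.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["a", "cat", "ran"]]
vocab, counts = contextual_matrix(corpus)
vectors = principal_word_vectors(counts, k=2)

for word, vec in zip(vocab, vectors):
    print(word, np.round(vec, 3))
```

Each row of vectors is a k-dimensional embedding of the corresponding word in vocab. On a real corpus the contextual matrix is large and sparse, so a truncated or randomized SVD would be a more practical choice than the dense decomposition used here.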

Principal word embedding is the method that generates principal word vectors. An implementation of principal word embedding is available here.

  1. Ali Basirat and Joakim Nivre, Real-valued syntactic word vectors, Journal of Experimental & Theoretical Artificial Intelligence, DOI: 10.1080/0952813X.2019.1653385 (2019)

  2. Ali Basirat, A Generalized Principal Component Analysis for Word Embedding, The Seventh Swedish Language Technology Conference (SLTC), Stockholm (2018)

  3. Ali Basirat, Word Embedding through PCA, The Second Swedish Symposium on Deep Learning (SSDL), Gothenburg (2018)

  4. Ali Basirat, Principal Word Vectors, PhD Thesis, Uppsala University (2018)

