SentiVec is a method for distributional analysis that produces vector representations of words (also known as word embeddings). It extends and outperforms the popular Word2Vec by optimizing an auxiliary lexical objective. The lexical objective takes advantage of word classes, such as positive and negative opinion words, and pushes the embeddings of words in the same class closer to each other.
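To make the idea of an auxiliary lexical objective more concrete, here is a minimal sketch of one plausible form of it: a logistic loss that tries to separate positive from negative opinion words along a learned direction. This is an illustration under stated assumptions, not the authors' implementation. The function names, the toy labels, the learning rate, and the choice of a logistic loss are all hypothetical, and the Word2Vec part of the training objective is left out entirely.

```python
import math
import random

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def dot(a, b) -> float:
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b) -> float:
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

def lexical_loss(vectors, labels, direction) -> float:
    """Mean logistic loss for predicting each word's class from its vector."""
    total = 0.0
    for vec, label in zip(vectors, labels):
        p = sigmoid(dot(vec, direction))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= label * math.log(p) + (1.0 - label) * math.log(1.0 - p)
    return total / len(vectors)

def lexical_step(vectors, labels, direction, lr: float) -> None:
    """One simultaneous gradient-descent step on the word vectors and the direction."""
    n = len(vectors)
    grad_dir = [0.0] * len(direction)
    for vec, label in zip(vectors, labels):
        err = (sigmoid(dot(vec, direction)) - label) / n  # d(loss)/d(logit)
        for k in range(len(vec)):
            grad_dir[k] += err * vec[k]          # uses the pre-update vector
            vec[k] -= lr * err * direction[k]    # move the word vector
    for k in range(len(direction)):
        direction[k] -= lr * grad_dir[k]

# Toy setup (assumption): two positive and two negative words, 8-D random vectors.
random.seed(0)
dim = 8
words = ["good", "great", "bad", "awful"]
labels = [1.0, 1.0, 0.0, 0.0]
vectors = [[random.gauss(0.0, 0.1) for _ in range(dim)] for _ in words]
direction = [random.gauss(0.0, 0.1) for _ in range(dim)]

initial_loss = lexical_loss(vectors, labels, direction)
for _ in range(200):
    lexical_step(vectors, labels, direction, lr=0.5)
final_loss = lexical_loss(vectors, labels, direction)

print("lexical loss:", initial_loss, "->", final_loss)
print("cos(good, great):", cosine(vectors[0], vectors[1]))
print("cos(good, bad):  ", cosine(vectors[0], vectors[2]))
```

In a full model, a term like this would be added to the Word2Vec loss with some weighting, so that the embeddings reflect both distributional context and lexicon class membership.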
Download Pre-trained Word Embeddings
- Amazon Embeddings (1.2 GB, 300D, Amazon Product Data infused with Opinion Lexicon)
- Wikipedia Embeddings (1.3 GB, 300D, Wikipedia Data infused with Opinion Lexicon)
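The on-disk format of these files is not stated here. As a rough sketch, assuming they follow the common word2vec text layout (an optional `count dim` header line, then one `word v1 ... vd` line per word), they could be read and compared as follows. The helper names and the toy values are hypothetical, and a binary file would need a different reader.

```python
import io
import math

def load_text_embeddings(handle):
    """Parse word2vec-style text: optional 'count dim' header, then 'word v1 ... vd'."""
    embeddings = {}
    for line_no, line in enumerate(handle):
        parts = line.rstrip().split(" ")
        if line_no == 0 and len(parts) == 2 and all(p.isdigit() for p in parts):
            continue  # skip the header line
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        embeddings[parts[0]] = [float(x) for x in parts[1:]]
    return embeddings

def cosine(a, b) -> float:
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den

# Made-up 2-D toy data standing in for a real 300-D embeddings file.
sample = io.StringIO("3 2\ngood 1.0 0.0\ngreat 0.8 0.6\nbad -1.0 0.0\n")
emb = load_text_embeddings(sample)

print(cosine(emb["good"], emb["great"]))
print(cosine(emb["good"], emb["bad"]))
```

For a real file, replace the `io.StringIO` object with `open(path, encoding="utf-8")`, where `path` points at the downloaded and unpacked embeddings.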
|Binary Classification Task||Amazon|| |
|Rotten Tomatoes Sentiment||77.9%||75.8%|
- Download SentiVec-Logistic (Apache License, Version 2.0)
Maksim Tkachenko, Chong Cher Chia, Hady W. Lauw, Searching for the X-Factor: Exploring Corpus Subjectivity for Word Embeddings, ACL, 2018.