About
SentiVec is a method for distributional analysis, which is used to produce vector representations for words (also known as word embeddings). The method extends and outperforms popular Word2Vec by optimizing an auxiliary lexical objective. The lexical objective takes advantage of word classes, such as positive and negative opinion words, and enforces word embeddings of the same class to be closer to each other.
Try out SentiVec Demo.
Download Pre-trained Word Embeddings
- Amazon Embeddings (1.2Gb, 300D, Amazon Product Data infused with Opinion Lexicon)
- Wikipedia Embeddings (1.3Gb, 300D, Wikipedia Data infused with Opinion Lexicon)
Binary Classification Task | Amazon
Accuracy |
Wikipedia
Accuracy |
---|---|---|
Amazon Sentiment | 86.4% | 82.6% |
Rotten Tomatoes Sentiment | 77.9% | 75.8% |
Objective-Subjective | 90.8% | 90.6% |
News Topics | 83.1% | 83.5% |
Source Code
- Download SentiVec-Logistic (Apache License, Version 2.0)
Cite
Maksim Tkachenko, Chong Cher Chia, Hady W. Lauw, Searching for the X-Factor: Exploring Corpus Subjectivity for Word Embeddings, ACL, 2018.