We make the first sentiment lexicon for Greek publicly and freely available. The need for such a lexicon stemmed from our experimentation with both supervised and unsupervised (i.e., lexicon-based) sentiment analysis methods, and from our conclusion that combining the two approaches can considerably improve sentiment detection accuracy (see Tsakalidis et al., 2014).
To this end, we devised a semi-automatic approach for constructing the lexicon. The approach is described in D2.3 Social Stream Mining Framework (subsection 5.4.1, p.63). In short, we first searched the electronic version of the Triantafyllides lexicon to collect a set of terms that can be used in an ironic, meiotic, abusive, mocking or vulgar tone. We also added terms whose descriptions contained emotional words (e.g. feel, love). We then proceeded with the term annotation: four researchers (two computer scientists and two computational linguists) annotated each term along eight dimensions: subjectivity, polarity and six emotions (happiness, sadness, anger, fear, disgust, surprise).
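The selection and annotation steps above can be illustrated with a minimal sketch. It is not the procedure documented in D2.3: the `Entry` structure, the label and cue-word sets, and the helper names `is_candidate` and `blank_annotation` are all hypothetical, and English placeholders stand in for the Greek dictionary content.

```python
from dataclasses import dataclass

# Hypothetical tone labels and emotional cue words. The text names these as
# examples only; the complete lists used for the lexicon are not given here.
TONE_LABELS = {"ironic", "meiotic", "abusive", "mocking", "vulgar"}
EMOTION_CUES = {"feel", "love"}

# The six emotions named in the text.
EMOTIONS = ("happiness", "sadness", "anger", "fear", "disgust", "surprise")


@dataclass
class Entry:
    """A simplified dictionary entry (assumed structure, for illustration)."""
    term: str
    usage_labels: set
    description: str


def is_candidate(entry: Entry) -> bool:
    """Keep an entry if it carries a tone label or its description
    contains one of the emotional cue words."""
    if entry.usage_labels & TONE_LABELS:
        return True
    words = {w.strip(".,;:()").lower() for w in entry.description.split()}
    return bool(words & EMOTION_CUES)


def blank_annotation(term: str) -> dict:
    """One annotator's empty record for a term: subjectivity, polarity
    and the six emotions, i.e. eight dimensions in total."""
    record = {"term": term, "subjectivity": None, "polarity": None}
    record.update({emotion: None for emotion in EMOTIONS})
    return record


# Placeholder entries; a real run would iterate over the dictionary itself.
entries = [
    Entry("term_a", {"ironic"}, "said of a person who boasts"),
    Entry("term_b", set(), "to feel deep affection"),
    Entry("term_c", set(), "a piece of furniture"),
]
candidates = [e.term for e in entries if is_candidate(e)]
annotation = blank_annotation(candidates[0])
```

In this sketch each of the four annotators would fill in one such record per candidate term; how the four sets of judgements are stored or combined in the released lexicon is not specified in this text.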
The annotated lexicon is available on GitHub.
Acknowledgements: We are grateful to researchers Ourania Voskaki and Kyriaki Ioannidou, both working with the Centre for the Greek Language, for their contribution to the annotation process and for their valuable insights and comments during the lexicon construction.
Adam Tsakalidis, Symeon Papadopoulos, Ioannis Kompatsiaris. An Ensemble Model for Cross-Domain Polarity Classification on Twitter. Proceedings of the 15th International Conference on Web Information Systems Engineering (WISE 2014), LNCS, Part II, pp. 168-177, Thessaloniki, Greece, October 12-14, 2014, Springer.