Fast Dawid-Skene: A Fast Vote Aggregation Scheme for Sentiment Classification

Vaibhav B Sinha, Sukrut Rao, Vineeth N Balasubramanian


Abstract

Many real-world problems can now be solved effectively using supervised machine learning. A major roadblock is often the lack of an adequate quantity of labeled data for training. A possible solution is to assign the task of labeling data to a crowd, and then infer the true label using aggregation methods. A well-known approach for aggregation is the Dawid-Skene (DS) algorithm, which is based on the principle of Expectation-Maximization (EM). We propose a new, simple yet effective EM-based algorithm, which can be interpreted as a ‘hard’ version of DS, that allows much faster convergence while maintaining similar accuracy in aggregation. We show the use of this algorithm as a quick and effective technique for online, real-time sentiment annotation. Our experiments on standard datasets show a significant speedup in aggregation time, up to ~8x over Dawid-Skene and ~6x over other fast EM methods, at competitive accuracy.
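To make the idea of a ‘hard’ version of DS concrete, the following is a minimal Python sketch of one way such a scheme could look. It is an illustration under stated assumptions, not the authors' implementation: the function name `hard_dawid_skene`, the `votes` data layout, the majority-vote initialization, the additive `smoothing` constant, and the stopping rule are all hypothetical choices made here. The sketch assumes that "hard" means each item is assigned its single most likely class in the E-step, rather than a probability distribution over classes as in standard Dawid-Skene.

```python
import math
from collections import Counter


def hard_dawid_skene(votes, n_classes, max_iter=100, smoothing=0.01):
    """Aggregate crowd votes with a hard-assignment EM loop (illustrative sketch).

    votes: dict mapping item -> list of (worker, label) pairs,
           with labels as integers in range(n_classes).
    Returns a dict mapping item -> inferred label.
    """
    # Assumed initialization: majority vote per item.
    labels = {
        item: Counter(label for _, label in pairs).most_common(1)[0][0]
        for item, pairs in votes.items()
    }
    workers = {worker for pairs in votes.values() for worker, _ in pairs}

    for _ in range(max_iter):
        # M-step: smoothed class counts and per-worker confusion counts,
        # computed from the current hard labels.
        prior = [smoothing] * n_classes
        confusion = {
            w: [[smoothing] * n_classes for _ in range(n_classes)]
            for w in workers
        }
        for item, pairs in votes.items():
            true_class = labels[item]
            prior[true_class] += 1
            for worker, label in pairs:
                confusion[worker][true_class][label] += 1
        prior_total = sum(prior)

        # Hard E-step: keep only the most likely class for each item.
        new_labels = {}
        for item, pairs in votes.items():
            best_class, best_score = None, float("-inf")
            for c in range(n_classes):
                score = math.log(prior[c] / prior_total)
                for worker, label in pairs:
                    row = confusion[worker][c]
                    score += math.log(row[label] / sum(row))
                if score > best_score:
                    best_class, best_score = c, score
            new_labels[item] = best_class

        # Assumed stopping rule: stop once no label changes.
        if new_labels == labels:
            break
        labels = new_labels

    return labels


if __name__ == "__main__":
    # Toy data: three hypothetical workers labeling four items.
    example_votes = {
        "i1": [("a", 0), ("b", 0), ("c", 1)],
        "i2": [("a", 1), ("b", 1), ("c", 1)],
        "i3": [("a", 0), ("b", 0), ("c", 0)],
        "i4": [("a", 1), ("b", 1), ("c", 0)],
    }
    print(hard_dawid_skene(example_votes, n_classes=2))
```

Because the labels are discrete in this sketch, convergence can be detected by checking whether any label changed between iterations, instead of tracking a likelihood tolerance. This is one plausible reason a hard variant could need fewer iterations than soft DS, though the sketch itself makes no claim about the speedups reported in the abstract.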

Type

Workshop paper

Publication

Workshop on Issues of Sentiment Discovery and Opinion Mining (WISDOM) at the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2018