Tsoumakas, G., Papadopoulos, A., Qian, W., Vologiannidis, S., D’yakonov, A., Puurula, A., Read, J., Švec, J., Semenov, S. (2014) WISE 2014 challenge: Multi-label classification of print media articles to topics, Proceedings of the 2014 Web Information Systems Engineering Conference, pp. 541-548.
Author(s): Tsoumakas, G., Papadopoulos, A., Qian, W., Vologiannidis, S., D’yakonov, A., Puurula, A., Read, J., Švec, J., Semenov, S.
Abstract: The WISE 2014 challenge was concerned with the task of multi-label classification of articles coming from Greek print media. Raw data comes from the scanning of print media, article segmentation, and optical character segmentation, and is therefore quite noisy. Each article is examined by a human annotator and assigned to one or more of the topics being monitored. Topics range from specific persons, products, and companies, which can be easily categorized based on keywords, to more general semantic concepts, such as environment or economy. Building multi-label classifiers for the automated annotation of articles into topics can support the work of human annotators by suggesting a list of all topics in order of relevance, or can even automate the annotation process for media and/or categories that are easier to predict. This saves valuable time and allows a media monitoring company to expand the portfolio of media being monitored. This paper summarizes the approaches of the top 4 among the 121 teams that participated in the competition.