Nikolaos Mylonas, Stamatis Karlos, and Grigorios Tsoumakas. 2020. Zero-Shot Classification of Biomedical Articles with Emerging MeSH Descriptors. In 11th Hellenic Conference on Artificial Intelligence (SETN 2020), September 2–4, 2020, Athens, Greece. ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3411408.3411414
Although many applications developed in recent years produce vast amounts of data, the inability to obtain ground-truth target values for these data has triggered the appearance of several new machine learning (ML) variants that tackle this problem. The main reasons are the evolving nature of most real-world problems, which makes conventional approaches incompatible and hard to apply, as well as noisy data sources and the shortage of training data needed to produce robust predictive models. The objective of this work is to provide a new ML approach in the field of zero-shot classification, focused on classifying abstracts from PubMed, a well-known resource of publications in the biomedical field. The proposed approach differs in that it uses BioBERT embeddings to transform the textual data into a new semantic space, exploiting them at the sentence level instead of adopting the usual n-gram solution. Moreover, because it constructs a learning model without requiring any collected training data, it leads to an instance-based approach; at the same time, it can be used as an internal mechanism for assigning labels to collected unlabeled training data, creating appropriate weakly supervised, batch-based learning variants. Our evaluations over 3 different MeSH terms highlight the usefulness of these approaches against a state-of-the-art approach and a well-defined baseline, respectively.
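The sentence-level, instance-based idea described above can be illustrated with a minimal sketch. It is not the exact procedure of this work: the function names, the max-over-sentences aggregation, and the 0.5 threshold are assumptions made for illustration, and `toy_embed` is a hashed bag-of-words stand-in for a real sentence encoder such as BioBERT, so that the example runs without downloading a model.

```python
import math
import re
from typing import Callable, List

Vector = List[float]


def toy_embed(text: str, dim: int = 64) -> Vector:
    """Hypothetical stand-in for a sentence encoder (e.g. BioBERT).

    Builds a hashed bag-of-words vector; a real encoder would be swapped in here.
    """
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        # Deterministic bucket: sum of character codes modulo dim.
        vec[sum(ord(c) for c in token) % dim] += 1.0
    return vec


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def split_sentences(abstract: str) -> List[str]:
    """Naive sentence splitter on ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", abstract.strip())
    return [p for p in parts if p]


def zero_shot_score(
    abstract: str, label: str, embed: Callable[[str], Vector] = toy_embed
) -> float:
    """Score an abstract against an unseen label without any training data.

    Assumption: the abstract's score is the best similarity of any one sentence.
    """
    label_vec = embed(label)
    sims = (cosine(embed(s), label_vec) for s in split_sentences(abstract))
    return max(sims, default=0.0)


def zero_shot_predict(
    abstract: str,
    label: str,
    threshold: float = 0.5,  # assumed value, would need tuning in practice
    embed: Callable[[str], Vector] = toy_embed,
) -> bool:
    """Assign the label when the sentence-level score reaches the threshold."""
    return zero_shot_score(abstract, label, embed) >= threshold
```

In this sketch, the boolean predictions could also serve as weak labels for unlabeled abstracts, which a conventional classifier would then be trained on, mirroring the weakly supervised variants mentioned above.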