A. Nentidis, A. Krithara, G. Tsoumakas, G. Paliouras (2019) Beyond MeSH: Fine-Grained Semantic Indexing of Biomedical Literature based on Weak Supervision, 32nd IEEE CBMS International Symposium on Computer-Based Medical Systems
Biomedical literature in MEDLINE/PubMed is semantically indexed with MeSH thesaurus entries (subject annotations), each of which may correspond to more than one related but distinct domain concept. In such cases, the subject annotations do not match the level of detail available in the domain and do not always meet the information needs of domain experts. In this work, we propose a method to automatically refine subject annotations at the level of concepts and apply it to the MeSH descriptor for Alzheimer’s Disease, which corresponds to six different concepts representing disease subtypes. The results indicate that using concept occurrence as weak supervision can improve upon the predictive performance of literal string matching alone. The refined annotations can support more precise concept-based search, enable the integration of subject annotations with other semantic information, and help keep subject annotations consistent as the MeSH thesaurus evolves with the addition of more detailed entries.
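As a rough illustration of what concept occurrence as weak supervision could look like, the sketch below derives noisy concept-level labels for articles by matching concept terms in their text. It is a minimal sketch under stated assumptions, not the authors' implementation: the lexicon `CONCEPT_TERMS`, its three concept labels and surface terms, and the helper names `weak_labels` and `build_weak_training_set` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical lexicon (assumption): concept label -> surface terms.
# These labels and terms are illustrative, not the actual MeSH concepts.
CONCEPT_TERMS = {
    "early_onset": ["early onset alzheimer", "early-onset alzheimer"],
    "late_onset": ["late onset alzheimer", "late-onset alzheimer"],
    "familial": ["familial alzheimer"],
}


def weak_labels(text, lexicon=CONCEPT_TERMS):
    """Return the set of concept labels whose terms occur in the text."""
    lowered = text.lower()
    found = set()
    for concept, terms in lexicon.items():
        for term in terms:
            # Require a word boundary before the term so that a match
            # cannot start in the middle of another word.
            if re.search(r"\b" + re.escape(term), lowered):
                found.add(concept)
                break
    return found


def build_weak_training_set(articles, lexicon=CONCEPT_TERMS):
    """Pair each article with a binary label vector from concept occurrence."""
    concepts = sorted(lexicon)
    dataset = []
    for text in articles:
        labels = weak_labels(text, lexicon)
        dataset.append((text, [int(c in labels) for c in concepts]))
    return concepts, dataset
```

The resulting pairs of text and label vectors could then serve as a noisy training set for any multi-label text classifier. Which features and classifiers the paper actually uses is not stated in the abstract, so that step is left out here.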