AUTH @ CLSciSumm 20, LaySumm 20, LongSumm 20

A. Gidiotis, S. Stefanidis, G. Tsoumakas (2020). AUTH @ CLSciSumm 20, LaySumm 20, LongSumm 20. 2020 Conference on Empirical Methods in Natural Language Processing

Text summarization, Text classification, Natural language processing, Deep learning

We present the systems we submitted to the shared tasks of the Workshop on Scholarly Document Processing at EMNLP 2020. Our approaches focus on exploiting large Transformer models pre-trained on huge corpora and adapting them to the individual shared tasks. For tasks 1A and 1B of CL-SciSumm, we use different variants of the BERT model to tackle “cited text span” and “facet” identification, respectively. For the summarization tasks (task 2 of CL-SciSumm, LaySumm and LongSumm), we use different variants of the PEGASUS model, with and without fine-tuning, adapted to the nuances of each task.