G. Tzanis, C. Berberidis, I. Vlahavas, “MANTIS: A Data Mining Methodology for Effective Translation Initiation Site Prediction”, Proceedings of the 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, IEEE, Lyon, France, 2007.
Predicting the translation initiation site in a genomic sequence with the highest possible accuracy remains an open problem for the research community. Current approaches perform quite well; however, researchers still lack a general framework that offers an effective and reliable methodology. We developed a prediction methodology that combines ad hoc and discovered knowledge to increase accuracy significantly and reliably. The methodology is modular and consists of three major decision components: a consensus component, a coding region classification component, and a novel ATG location-based component that exploits the advantages of the popular Ribosome Scanning Model while overcoming its limitations. The three components are combined through stacked generalization into a meta-classification system, yielding a highly effective prediction framework. We performed extensive comparative experiments on four different datasets, showing that the increase in accuracy and adjusted accuracy is not only statistically significant but also the highest reported.
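To make the three-component structure concrete, the following is a minimal Python sketch, not the authors' implementation. Every function name and scoring rule in it is a hypothetical stand-in: the consensus rule (purine at -3, G at +4), the in-frame stop-codon check, and the upstream-ATG penalty are simple illustrative assumptions. The fixed weights only stand in for the meta-classifier, which under stacked generalization would be learned from the component outputs on training data.

```python
# Illustrative sketch only: all names and scoring rules are assumptions,
# not the components described in the paper.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def candidate_atgs(seq):
    """Return the 0-based position of every ATG in the sequence."""
    return [i for i in range(len(seq) - 2) if seq[i:i + 3] == "ATG"]


def consensus_score(seq, pos):
    """Toy consensus check: purine at -3 and G at +4 (the A of ATG is +1)."""
    score = 0.0
    if pos >= 3 and seq[pos - 3] in "AG":
        score += 0.5
    if pos + 3 < len(seq) and seq[pos + 3] == "G":
        score += 0.5
    return score


def coding_score(seq, pos, window=30):
    """Toy coding check: 1.0 if no in-frame stop codon occurs in the next
    `window` codons (or before the sequence ends), else 0.0."""
    for k in range(1, window + 1):
        codon = seq[pos + 3 * k:pos + 3 * k + 3]
        if len(codon) < 3:
            break
        if codon in STOP_CODONS:
            return 0.0
    return 1.0


def location_score(seq, pos):
    """Soft form of the ribosome scanning idea: the first ATG scores 1.0,
    and each additional upstream ATG lowers the score."""
    upstream = sum(1 for p in candidate_atgs(seq) if p < pos)
    return 1.0 / (1 + upstream)


def combined_score(seq, pos, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three component scores. The fixed weights are a
    placeholder for a meta-classifier trained by stacked generalization."""
    parts = (
        consensus_score(seq, pos),
        coding_score(seq, pos),
        location_score(seq, pos),
    )
    return sum(w * s for w, s in zip(weights, parts))


def predict_tis(seq, weights=(1.0, 1.0, 1.0)):
    """Return the highest-scoring candidate ATG position, or None."""
    candidates = candidate_atgs(seq)
    if not candidates:
        return None
    return max(candidates, key=lambda p: combined_score(seq, p, weights))
```

On the made-up sequence `"TTATGTAACCGCCACCATGGCGGCTGCA"`, `predict_tis` picks the second ATG (position 16) over the first (position 2): the first ATG is followed immediately by an in-frame stop codon and lacks the consensus context, so the location component alone does not decide the outcome. This is the kind of case where strictly choosing the first ATG, as a plain scanning rule would, gives the wrong answer.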