Dissimilarity space representations and automatic feature selection for protein function prediction

De Santis, Enrico; Martino, Alessio; Rizzi, Antonello; Mascioli, Fabio Massimo Frattale

doi:10.1109/IJCNN.2018.8489115

Dissimilarity spaces, along with feature reduction/selection techniques, are among the mainstream approaches to pattern recognition problems in structured (and possibly non-metric) domains. In this work, we investigate dissimilarity space representations in a biology-related application, namely protein function classification, as proteins are a seminal example of structured data given their primary and tertiary structures. Specifically, we propose two analyses, one relying on the complete dissimilarity matrix and one on a dimensionally-reduced version of it, thereby casting the pattern recognition problem from structured domains to real-valued feature vectors, for which any standard classification algorithm can be used. A third, hybrid analysis uses a clustering-based one-class classifier exploiting different representations. First results obtained on a subset of the Escherichia coli proteome are promising, and some of the analyses presented in this work may also suit field experts, further bridging the gap between natural sciences and computational intelligence techniques.
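As an illustration of the general idea only, the following minimal Python sketch embeds structured objects into a dissimilarity space and classifies them there. Everything in it is an assumption made for the example and is not taken from the paper: the toy sequences, the use of Levenshtein edit distance as the dissimilarity, the hand-picked prototypes for the reduced representation, and the 1-nearest-neighbour rule standing in for "any standard classification algorithm".

```python
from typing import Callable, List, Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, used here as a stand-in dissimilarity measure."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def embed(objects: Sequence[str], prototypes: Sequence[str],
          d: Callable[[str, str], float]) -> List[List[float]]:
    """Represent each object by its dissimilarities to the prototypes."""
    return [[float(d(x, p)) for p in prototypes] for x in objects]


def nearest_neighbour(train_vecs: List[List[float]], train_labels: List[str],
                      query: List[float]) -> str:
    """1-NN in the embedded space (squared Euclidean distance)."""
    def sq_dist(u: List[float], v: List[float]) -> float:
        return sum((ui - vi) ** 2 for ui, vi in zip(u, v))
    best = min(range(len(train_vecs)), key=lambda i: sq_dist(train_vecs[i], query))
    return train_labels[best]


# Hypothetical toy data: short strings standing in for protein sequences.
train = ["AAAA", "AAAT", "GGGG", "GGGC"]
labels = ["class_a", "class_a", "class_b", "class_b"]

# Complete representation: every training object is a prototype, so the
# embedding of the training set is the full pairwise dissimilarity matrix.
D = embed(train, train, edit_distance)

# Reduced representation: keep only a subset of prototypes (columns).
# The subset is chosen by hand here, not by an automatic selection method.
D_reduced = embed(train, ["AAAA", "GGGG"], edit_distance)

# Classify an unseen object from its dissimilarities to the prototypes.
query_vec = embed(["AATA"], train, edit_distance)[0]
predicted = nearest_neighbour(D, labels, query_vec)
print(predicted)
```

Each row of `D` is an ordinary real-valued feature vector, so the 1-nearest-neighbour rule could be swapped for any other vector-based classifier without changing the embedding step.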

De Santis, Enrico; Martino, Alessio; Rizzi, Antonello; Mascioli, Fabio Massimo Frattale. (2018). Dissimilarity space representations and automatic feature selection for protein function prediction. In 2018 International Joint Conference on Neural Networks (IJCNN) (pp. 1-8). Institute of Electrical and Electronics Engineers (IEEE). ISBN: 978-1-5090-6014-6. DOI: 10.1109/IJCNN.2018.8489115. https://ieeexplore.ieee.org/document/8489115.



Conference year: 2018
ISBN: 978-1-5090-6014-6
Keywords: protein function prediction; dissimilarity space representations; automatic feature selection; one class classification; genetic algorithms; support vector machines
Appears in types: 04.1 - Contribution in conference proceedings (Paper in Proceedings)

Files in this item:

File	Size	Format
DeSantis_Dissimilarity_2018.pdf (archive managers only; type: publisher's version; license: DRM (digital rights management) not defined)	1.42 MB	Adobe PDF


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11385/214589

