Supervised machine learning techniques and genetic optimization for occupational diseases risk prediction / Di Noia, Antonio; Martino, Alessio; Montanari, Paolo; Rizzi, Antonello. - In: SOFT COMPUTING. - ISSN 1432-7643. - 24:6(2020), pp. 4393-4406. [10.1007/s00500-019-04200-2]

Supervised machine learning techniques and genetic optimization for occupational diseases risk prediction

Martino, Alessio; 2020

Abstract

Workers' healthcare has recently gained considerable attention as many countries grow increasingly concerned about welfare. This paper addresses the problem of predicting occupational disease risks by means of computational intelligence and pattern recognition techniques. Specifically, three machine learning approaches are compared. The first is based on the k-means algorithm, which determines a set of meaningful labelled clusters that serve as the final model. The other two are based on fully supervised techniques, namely Support Vector Machines and K-Nearest Neighbours. Real data describing both the worker and the workplace, mixing numerical and categorical attributes, have been used for testing. The three approaches are automatically tuned by genetic algorithms, which simultaneously search for the optimal classifier hyperparameters and the optimal weights of an ad hoc dissimilarity measure so as to maximize classification performance. Computational results show that the three approaches are broadly comparable in terms of performance, but the clustering-based approach allows a deeper knowledge discovery phase, helpful for further risk assessment and forecasting.
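The tuning idea in the abstract can be illustrated with a minimal sketch. The abstract gives neither the paper's actual dissimilarity measure nor its genetic operators or data, so everything below is an assumption made for illustration: a weighted sum of per-attribute dissimilarities over mixed numerical and categorical attributes, a plain K-Nearest Neighbours vote, and a small elitist genetic search over the genome `[k, w_1, ..., w_n]`. The names `dissimilarity`, `knn_predict`, `accuracy` and `genetic_search` are hypothetical.

```python
import random
from collections import Counter

# Illustrative sketch only: the measure, the operators and the genome layout
# are assumptions, not the method described in the paper.

def dissimilarity(a, b, weights, is_categorical):
    """Weighted sum of per-attribute dissimilarities.

    Numerical attributes (assumed already scaled to [0, 1]) contribute their
    absolute difference; categorical attributes contribute a 0/1 mismatch.
    """
    total = 0.0
    for x, y, w, cat in zip(a, b, weights, is_categorical):
        d = float(x != y) if cat else abs(x - y)
        total += w * d
    return total

def knn_predict(query, X, y, k, weights, is_categorical):
    """Majority vote among the k training records closest to the query."""
    ranked = sorted(
        range(len(X)),
        key=lambda i: dissimilarity(query, X[i], weights, is_categorical),
    )
    votes = Counter(y[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

def accuracy(genome, X_train, y_train, X_val, y_val, is_categorical):
    """Fitness of a genome [k, w_1, ..., w_n]: validation accuracy of k-NN."""
    k, weights = genome[0], genome[1:]
    hits = sum(
        knn_predict(q, X_train, y_train, k, weights, is_categorical) == t
        for q, t in zip(X_val, y_val)
    )
    return hits / len(X_val)

def genetic_search(fitness, n_weights, k_choices=(1, 3, 5),
                   pop_size=12, generations=15, seed=0):
    """Elitist genetic search over k and the attribute weights in [0, 1]."""
    rng = random.Random(seed)

    def random_genome():
        return [rng.choice(k_choices)] + [rng.random() for _ in range(n_weights)]

    def crossover(a, b):
        # One-point crossover; the cut is at least 1, so k comes from parent a.
        cut = rng.randrange(1, len(a))
        return a[:cut] + b[cut:]

    def mutate(genome):
        child = list(genome)
        i = rng.randrange(len(child))
        if i == 0:
            child[i] = rng.choice(k_choices)
        else:
            # Gaussian perturbation, clipped back into [0, 1].
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.2)))
        return child

    population = [random_genome() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elite = population[: pop_size // 2]  # the best half survives unchanged
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite)))
            for _ in range(pop_size - len(elite))
        ]
        population = elite + children
    return max(population, key=fitness)
```

In this sketch a fitness function would be built by fixing the training and validation sets in `accuracy` and passing the resulting one-argument callable to `genetic_search`. The same genome layout could in principle carry SVM or k-means hyperparameters instead of `k`, but that extension is not shown here.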
Keywords

cluster analysis; computational intelligence; occupational diseases risk prediction; pattern recognition; predictive medicine; support vector machine
Files in this item:
File: Di Noia_Supervised-machine_2020.pdf
Access: archive managers only
Type: Publisher's version
License: DRM (Digital rights management) not defined
Size: 773.63 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11385/214541
Citations
  • Scopus 35
  • ISI 17