Comparing the ability of embedding methods on metabolic hypergraphs for capturing taxonomy-based features

Sinaimeri, Blerina; Martino, Alessio
2026

Abstract

Background: Metabolic networks are complex systems that describe the biochemical reactions within an organism through pairwise interactions between chemical compounds. While this representation is widely used to study biological function, it fails to capture the full structure of metabolic reactions, many of which involve more than two compounds. Hypergraphs offer a more natural representation, in which nodes represent metabolites and hyperedges represent reactions involving multiple participants. Clustering such metabolic hypergraphs can reveal systematic differences among evolutionarily distinct organisms, providing insight into ecological constraints and evolutionary pressures.

Methods: In this study, we investigate how different graph and hypergraph embedding methods influence unsupervised clustering, with the goal of capturing taxonomy-based classes. We apply 14 distinct embedding strategies to a large-scale dataset of 8467 metabolic hypergraphs. Each embedding is followed by hierarchical clustering using a fixed linkage method. To assess performance, we compare the resulting clusters against known taxonomic groupings.

Results: Our findings show that the choice of hypergraph embedding has a significant effect on clustering outcomes. Among the tested methods, Bag of Hyperedges with Jaccard distance, Histogram Cosine Kernel, and a Hypergraph Auto-Encoder consistently performed best. We also argue that the embedding method should be chosen according to the goal of the downstream task.
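The pipeline outlined in the abstract (embed each hypergraph, cluster hierarchically, compare with taxonomy) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: each organism is reduced to the set of its hyperedges, hyperedges are treated as undirected sets of metabolite names, the linkage is average linkage (the abstract only says "a fixed linkage method"), and agreement with the taxonomy is scored with the plain Rand index. The toy reactions, organism names, and group labels are hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance 1 - |A & B| / |A | B| between two sets of hyperedges."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_linkage(items, dist, n_clusters):
    """Naive agglomerative clustering with average linkage (assumed linkage).

    Returns one integer cluster label per item.
    """
    n = len(items)
    # Pairwise distances between individual items, keyed by (i, j) with i < j.
    d = {(i, j): dist(items[i], items[j]) for i, j in combinations(range(n), 2)}

    def cluster_dist(c1, c2):
        # Average linkage: mean distance over all cross-cluster pairs.
        total = sum(d[(min(i, j), max(i, j))] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    clusters = [[i] for i in range(n)]
    while len(clusters) > n_clusters:
        # Merge the two closest clusters; a < b always holds here.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: cluster_dist(clusters[ab[0]], clusters[ab[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    labels = [0] * n
    for k, members in enumerate(clusters):
        for i in members:
            labels[i] = k
    return labels


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two partitions agree."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical toy reactions: each hyperedge is the set of participating metabolites.
r1 = frozenset({"glucose", "ATP", "G6P", "ADP"})
r2 = frozenset({"G6P", "F6P"})
r3 = frozenset({"F6P", "ATP", "FBP", "ADP"})
r4 = frozenset({"CO2", "RuBP", "3PG"})
r5 = frozenset({"3PG", "ATP", "BPG", "ADP"})
r6 = frozenset({"BPG", "NADPH", "G3P", "NADP"})

# Hypothetical organisms, each represented as a bag (set) of hyperedges.
organisms = [{r1, r2, r3}, {r1, r2}, {r4, r5, r6}, {r4, r5}]
taxa = ["group_1", "group_1", "group_2", "group_2"]  # hypothetical taxonomy

labels = average_linkage(organisms, jaccard_distance, n_clusters=2)
score = rand_index(labels, taxa)
print(labels, score)
```

The sketch uses only the standard library so that it stays self-contained. On a dataset of thousands of hypergraphs, a dedicated clustering library and a chance-corrected agreement score would likely be more appropriate than this quadratic toy implementation.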
Keywords: Clustering; Embeddings; Hypergraphs; Kernel methods; Metabolic networks; Neural networks; Taxonomic groups
Cervellini, Mattia; Sinaimeri, Blerina; Matias, Catherine; Martino, Alessio (2026). Comparing the ability of embedding methods on metabolic hypergraphs for capturing taxonomy-based features. Algorithms for Molecular Biology (ISSN: 1748-7188), 21:1, 1-26. DOI: 10.1186/s13015-026-00298-w.
Files in this record:
s13015-026-00298-w.pdf (Open Access; type: publisher's version; license: Creative Commons; format: Adobe PDF; size: 3.17 MB)
Use this identifier to cite or link to this document: https://hdl.handle.net/11385/263618