Proceedings of the Workshop on Computation: Theory and Practice (WCTP 2023)

A Supervised Co-complex Probability Weighting of Yeast Composite Protein Networks using Gradient-boosted Trees for Protein Complex Detection

Authors
Anthony Van C. Cayetano¹,*, John Justine S. Villar¹
¹Scientific Computing Laboratory, Department of Computer Science, University of the Philippines, 1101, Diliman, Quezon City, Philippines
*Corresponding author. Email: accayetano4@up.edu.ph
Corresponding Author
Anthony Van C. Cayetano
Available Online 29 February 2024.
DOI
10.2991/978-94-6463-388-7_21
Keywords
protein complex prediction; graph clustering; PPIN
Abstract

Many previous studies have proposed methods that detect protein complexes from protein-protein interaction networks (PPINs) by applying clustering algorithms to the network, relying only on the topology of the PPIN. However, PPINs contain many false positives and false negatives, which makes them unreliable when used alone to detect protein complexes. Moreover, not all proteins in a protein complex interact with each other, and not all proteins that interact with each other belong to the same complex. Thus, relying solely on the physical interactions of proteins is not ideal for detecting protein complexes. This study extends a method by Yong et al. called SWC, which integrates other heterogeneous data sources into the PPIN to create a composite network in which each edge is weighted according to its posterior co-complex probability. When combined with various clustering algorithms, SWC yielded more accurate protein complex detection. This study attempts to improve SWC by integrating additional data sources and by using a more advanced machine learning model, gradient-boosted trees. The proposed method outperformed SWC in every performance metric, often by a considerable margin, in terms of precision-recall AUC, Brier score loss, and log loss when predicting co-complex edges. More importantly, it also outperformed SWC in terms of precision-recall AUC when used together with the Markov Cluster algorithm (MCL) to detect protein complexes. Lastly, it also outperformed various unsupervised weighting methods in all of these performance evaluations. These evaluations were performed on two yeast PPINs.
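As an illustration of two of the edge-level metrics named in the abstract, the following minimal sketch computes the Brier score loss and the log loss for predicted co-complex probabilities. The function names, the protein identifiers, the labels, and the probabilities are all hypothetical and chosen only for illustration; this is not the authors' evaluation code, and it uses no values from the paper.

```python
import math

def brier_score(labels, probs):
    """Mean squared difference between each predicted probability and its 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)

def log_loss(labels, probs, eps=1e-15):
    """Mean negative log-likelihood of the 0/1 labels.

    Probabilities are clipped to [eps, 1 - eps] so that log(0) never occurs.
    """
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)

# Hypothetical toy edges: label 1 means both proteins share a complex.
edges = [("P1", "P2"), ("P2", "P3"), ("P1", "P4"), ("P3", "P5")]
labels = [1, 1, 0, 0]

# Two hypothetical weightings of the same edges.
informative = [0.9, 0.8, 0.2, 0.1]    # probabilities that track the labels
uninformative = [0.5, 0.5, 0.5, 0.5]  # same weight for every edge

for name, probs in [("informative", informative), ("uninformative", uninformative)]:
    print(name, brier_score(labels, probs), log_loss(labels, probs))
```

Both metrics are losses, so lower is better: a weighting whose probabilities track the true co-complex labels scores lower than one that assigns the same weight to every edge.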

Copyright
© 2024 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the Workshop on Computation: Theory and Practice (WCTP 2023)
Series
Atlantis Highlights in Computer Sciences
Publication Date
29 February 2024
ISBN
978-94-6463-388-7
ISSN
2589-4900

Cite this article

TY  - CONF
AU  - Anthony Van C. Cayetano
AU  - John Justine S. Villar
PY  - 2024
DA  - 2024/02/29
TI  - A Supervised Co-complex Probability Weighting of Yeast Composite Protein Networks using Gradient-boosted Trees for Protein Complex Detection
BT  - Proceedings of the Workshop on Computation: Theory and Practice (WCTP 2023)
PB  - Atlantis Press
SP  - 342
EP  - 365
SN  - 2589-4900
UR  - https://doi.org/10.2991/978-94-6463-388-7_21
DO  - 10.2991/978-94-6463-388-7_21
ID  - Cayetano2024
ER  -