Proceedings of International Conference on Applied Innovation in IT  ·  2025/07/26  ·  Vol. 13  ·  Issue 3  ·  pp. 253–262
Integrated Features Based on Graph Clustering and Gene Expression
Sura Ibrahim Mohammed Ali and Sura Zaki Al Rashid
Integrating different biological features, such as topological features and gene expression, is challenging because the informativeness of each feature must be assessed individually before the features are used in predictive models. In this process, it is crucial that the outcomes reflect the underlying biological structure of the network while minimizing noise and irrelevant data. This study highlights the importance of rigorous pre-analysis for determining statistically significant correlations and joint effects among preprocessed features before machine-learning techniques are applied. For multidimensional datasets, this paper presents a systematic multi-feature methodology that unifies optimized graph clustering, weighted Jaccard similarity, and dimension reduction based on principal component analysis (PCA). The objective is to identify novel, uncharacterized gene associations in complex biological networks. The study also offers more refined insights into gene interactions within their networks, revealing patterns and relationships that might be hidden by broad data analysis. The method's performance was validated against the benchmarks of the fifth Dialogue for Reverse Engineering Assessments and Methods (DREAM5) challenge to determine its ability to analyze complex biological networks.
Topological Analysis, Multi-Feature Framework, Gene Expression, Graph Clustering, Gene Regulatory Network.
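The abstract names weighted Jaccard similarity and PCA-based dimension reduction as building blocks but does not give their formulation. The following is a minimal sketch of how these two pieces could fit together, assuming the common min/max definition of weighted Jaccard similarity and a plain SVD-based PCA. The function names, the toy adjacency matrix, and the random expression values are hypothetical illustrations, not the authors' implementation or data.

```python
import numpy as np


def weighted_jaccard(x, y):
    """Weighted Jaccard similarity of two non-negative vectors.

    Assumed definition: sum(min(x_i, y_i)) / sum(max(x_i, y_i)).
    Returns 0.0 when both vectors are all zeros.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if np.any(x < 0) or np.any(y < 0):
        raise ValueError("weights must be non-negative")
    denom = np.maximum(x, y).sum()
    return float(np.minimum(x, y).sum() / denom) if denom > 0 else 0.0


def pca_reduce(features, n_components):
    """Project the rows of `features` onto the top principal components."""
    X = np.asarray(features, dtype=float)
    X = X - X.mean(axis=0)  # center each column
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:n_components].T


# Hypothetical toy network: 4 genes with symmetric, weighted edges.
adjacency = np.array([
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.5],
    [0.0, 0.0, 0.5, 0.0],
])

# Hypothetical expression matrix: 4 genes x 3 conditions.
rng = np.random.default_rng(0)
expression = rng.normal(size=(4, 3))

# Topological feature: pairwise weighted Jaccard similarity of neighbourhoods.
n = adjacency.shape[0]
similarity = np.array([
    [weighted_jaccard(adjacency[i], adjacency[j]) for j in range(n)]
    for i in range(n)
])

# Combine topological and expression features, then reduce to 2 dimensions.
combined = np.hstack([similarity, expression])
reduced = pca_reduce(combined, n_components=2)
print(reduced.shape)
```

In this sketch each gene ends up with a two-dimensional representation that mixes neighbourhood overlap with expression values; a clustering step could then be run on `reduced`. How the paper actually weights, scales, or clusters these features is not stated in the abstract.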

Proceedings of the International Conference on Applied Innovations in IT by Anhalt University of Applied Sciences is licensed under CC BY-SA 4.0  ·  This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License
