Proceedings of International Conference on Applied Innovation in IT
2025/08/29, Volume 13, Issue 4, pp. 91-100

Integrating Feature Selection and Machine Learning Boosting for Accurate Breast Cancer Prediction


Wisal Hashim Abdulsalam, Ruma Kareem K. Ajeena and Mohammed Ayad Saad


Abstract: Breast cancer is a prevalent and devastating disease and remains a major contributor to cancer-related mortality among women worldwide. The increasing incidence and fatality rates are often associated with changes in lifestyle and the influence of environmental factors. In response to these alarming trends, the development and deployment of automated breast cancer diagnostic systems have become increasingly important in modern healthcare. This study investigates the performance of several boosting algorithms - CatBoost, LightGBM, XGBoost, AdaBoost, and Gradient Boosting - for breast cancer prediction using the Wisconsin Diagnostic Breast Cancer (WDBC) dataset. The dataset is publicly available on Kaggle and consists of 569 instances, including 357 benign and 212 malignant cases. The proposed framework encompasses data preprocessing, feature selection, and classification stages. Model performance was evaluated using multiple metrics to ensure robust analysis and objective assessment. The experimental results demonstrate that LightGBM outperformed the other models, highlighting the effectiveness of boosting-based approaches for breast cancer diagnosis and emphasizing the potential of these techniques for further advancements in oncology research.
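
The abstract does not say which metrics were used. As a minimal sketch, assuming the common choices of accuracy, precision, recall, and F1 with malignant as the positive class, the Python below scores a trivial classifier that labels every case benign. It uses only the class counts quoted in the abstract (357 benign, 212 malignant); the names `Confusion` and `score` are illustrative and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Confusion-matrix counts, with malignant as the positive class (assumption)."""
    tp: int  # malignant cases predicted malignant
    fp: int  # benign cases predicted malignant
    tn: int  # benign cases predicted benign
    fn: int  # malignant cases predicted benign


def score(c: Confusion) -> dict:
    """Return accuracy, precision, recall, and F1 for the given counts."""
    total = c.tp + c.fp + c.tn + c.fn
    accuracy = (c.tp + c.tn) / total
    # Guard against empty denominators (no positive predictions / no positives).
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical baseline: predict "benign" for all 569 WDBC instances.
baseline = score(Confusion(tp=0, fp=0, tn=357, fn=212))

for name, value in baseline.items():
    print(f"{name}: {value:.4f}")
```

The always-benign baseline reaches roughly 63% accuracy (357/569) while detecting no malignant cases, which is one reason a single metric is not enough to compare classifiers on this dataset.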

Keywords: Breast Cancer, CatBoost, XGBoost, LightGBM, AdaBoost, Gradient Boosting.

DOI: 10.25673/122070




Proceedings of the International Conference on Applied Innovations in IT by Anhalt University of Applied Sciences is licensed under CC BY-SA 4.0


ISSN 2199-8876
Publisher: Edition Hochschule Anhalt
Location: Anhalt University of Applied Sciences
Email: leiterin.hsb@hs-anhalt.de
Phone: +49 (0) 3496 67 5611
Address: Building 01 - Red Building, Top floor, Room 425, Bernburger Str. 55, D-06366 Köthen, Germany
