PCOS PREDICTION USING MACHINE LEARNING TECHNIQUES: A COMPARATIVE ANALYSIS OF MODELS AND PRACTICAL APPLICATION
Abstract
Polycystic Ovary Syndrome (PCOS) is an endocrine disorder that affects women of reproductive age. It is challenging to diagnose because of its clinical heterogeneity and its symptom overlap with other conditions. This study investigates the use of Machine Learning (ML) techniques to improve the diagnostic accuracy of PCOS, using a public dataset to compare four classifiers: Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Decision Tree (DT), and Random Forest (RF). Feature selection and data balancing techniques are applied to optimize the models. Finally, the study proposes SOP ASSIST, an API that returns a patient's diagnostic result using the best-performing trained model.
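The comparison described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption, not the study's actual setup: the data is synthetic (generated with scikit-learn's make_classification as a stand-in for the public PCOS dataset), balancing is done by simple random oversampling of the minority class, feature selection uses SelectKBest with an ANOVA F-test, and the F1 score is the comparison metric. The abstract does not name the specific techniques, hyperparameters, or metric used, and the SOP ASSIST API is not sketched.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils import resample

# Assumption: synthetic, roughly 2:1 imbalanced data standing in for the real dataset.
X, y = make_classification(n_samples=600, n_features=20, n_informative=6,
                           weights=[0.67, 0.33], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)


def oversample_minority(X, y, seed=0):
    """Random oversampling (assumed technique): draw minority rows with
    replacement until both classes have the same number of samples."""
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    n_extra = counts.max() - counts.min()
    X_extra = resample(X[y == minority], replace=True,
                       n_samples=n_extra, random_state=seed)
    return np.vstack([X, X_extra]), np.concatenate([y, np.full(n_extra, minority)])


# Balance only the training split so the test split keeps its original distribution.
X_bal, y_bal = oversample_minority(X_train, y_train)

# The four classifiers named in the abstract; hyperparameters are illustrative.
models = {
    "SVM": SVC(kernel="rbf", random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "DT": DecisionTreeClassifier(random_state=0),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
}

scores = {}
for name, clf in models.items():
    # Scale, keep the 10 highest-scoring features, then fit the classifier.
    pipe = make_pipeline(StandardScaler(), SelectKBest(f_classif, k=10), clf)
    pipe.fit(X_bal, y_bal)
    scores[name] = f1_score(y_test, pipe.predict(X_test))

best = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: F1 = {score:.3f}")
print("best:", best)
```

Placing the scaler and the feature selector inside the pipeline means both are fitted on the training data only, which avoids leaking test-set information into the comparison.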
Author Biographies
Universidade Federal Rural do Semi-Árido (UFERSA).
Universidade Federal Rural do Semi-Árido (UFERSA).
Universidade Federal Rural do Semi-Árido (UFERSA).
Universidade Federal da Paraíba (UFPB).
