APPLICATION OF MACHINE LEARNING MODELS IN THE CONTEXT OF BOLSA FAMÍLIA: AN APPLIED STUDY IN RIO GRANDE DO NORTE AND PARAÍBA

Abstract

This study investigates the application of Machine Learning (ML) techniques to support decision-making in public administration, focusing on predicting family eligibility for the Bolsa Família Program in the Brazilian states of Rio Grande do Norte (RN) and Paraíba (PB). De-identified microdata from the Cadastro Único database (2016–2018) were used to train and evaluate predictive models. After data preprocessing, class balancing with the Synthetic Minority Over-sampling Technique (SMOTE), and dimensionality reduction using SelectKBest, five ML models were implemented: K-Nearest Neighbors, Support Vector Machine, Random Forest, XGBoost, and a Recurrent Neural Network. The results show that tree-based models, neural networks, and Support Vector Machines achieve robust performance in both states, with accuracy values of up to 90%. Random Forest, XGBoost, and the Recurrent Neural Network were more stable in RN, while the Support Vector Machine achieved the best performance in PB, indicating regional differences in data separability. Feature selection effectively reduced model complexity without loss of accuracy, highlighting income, household structure, access to basic services, and family size as key determinants of eligibility. Overall, the findings confirm the feasibility of using ML models as decision-support tools for social policy management, contributing to more efficient monitoring and allocation of public resources. Despite limitations related to data availability and scope, this study provides empirical evidence of the potential of Artificial Intelligence to support evidence-based policymaking in a transparent and ethical manner.
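The class-balancing step mentioned above can be illustrated with a minimal pure-Python sketch of the core SMOTE idea: each synthetic sample is placed at a random point on the segment between a minority-class sample and one of its k nearest minority-class neighbours. This is not the authors' implementation, and the study does not state which library or parameters were used. The function names `smote` and `balance`, the default `k=5`, the fixed seed, and the binary-label assumption are all hypothetical choices made for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows interpolated from minority-class rows."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other rows
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [row for j, row in enumerate(minority) if j != i]
        others.sort(key=lambda row: math.dist(base, row))
        neighbour = rng.choice(others[:k])
        # Interpolate: a random point on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label, k=5, seed=0):
    """Oversample the minority class until both classes have equal size.

    Assumes a binary problem (illustrative assumption): every row that is
    not labelled minority_label is treated as the majority class.
    """
    minority = [row for row, label in zip(X, y) if label == minority_label]
    n_majority = len(X) - len(minority)
    n_new = max(n_majority - len(minority), 0)
    new_rows = smote(minority, n_new, k=k, seed=seed)
    # Build new lists so the caller's X and y are left unchanged.
    return X + new_rows, y + [minority_label] * len(new_rows)
```

Calling `balance(X, y, minority_label=1)` on a list of feature rows and a parallel list of labels returns new lists in which both classes have the same number of rows. In practice, oversampling of this kind is normally applied only to the training split, so that evaluation data contain no synthetic samples.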

Author Biographies

Luiz Fernando da Cunha Silva, Universidade Federal Rural do Semi-Árido (UFERSA)

Holds a Bachelor's degree in Information Systems from the Federal Rural University of the Semi-Arid Region (UFERSA). Works as a collaborating researcher at the Technological Institute of Aeronautics (ITA) in the area of space research and participates in research projects at UFERSA and the Federal University of Paraíba (UFPB), including Women in STEM and DATALAB. His main areas of interest include Data Science, Artificial Intelligence, Machine Learning, Intelligent Systems, and Systems Development.

Maria Eduarda Bandeira Hora de Vasconcelos, Universidade Federal da Paraíba (UFPB)

Undergraduate student in Data Science and Artificial Intelligence at the Federal University of Paraíba (UFPB). Has programming experience in Python, JavaScript, and C/C++. Develops projects that deliver relevant solutions through the analysis of real-world data and the application of machine learning models. Interested in artificial intelligence, databases, and software testing.

Verônica Maria Lima Silva, Universidade Federal da Paraíba (UFPB)

Holds a Bachelor’s degree in Computer Engineering from the Federal University of Ceará (2011). Since 2015, she has been a professor at the Federal Rural University of the Semi-Arid Region (UFERSA), and she earned a Ph.D. in Electrical Engineering from the Federal University of Campina Grande (UFCG) in 2019. Her research interests include digital systems, analog-to-digital converters, analog information converters, embedded systems, and artificial intelligence.

Samara Martins Nascimento Gonçalves, Universidade Federal Rural do Semi-Árido (UFERSA)

Holds a Ph.D. in Computer Science from the Federal University of Ceará. She is an Associate Professor at the Federal Rural University of the Semi-Arid Region (UFERSA) and one of the leaders of the Software Innovations Laboratory (LIS) research group. Her main research interests include Databases, Big Data, Data Streams, NoSQL Databases, Data Warehousing, Data Management, Systems Analysis, Software Quality, and Software Metrics.

How to Cite

Eduarda Bandeira Hora de Vasconcelos, M., Maria Lima Silva, V., & Martins Nascimento Gonçalves, S. (2026). APPLICATION OF MACHINE LEARNING MODELS IN THE CONTEXT OF BOLSA FAMÍLIA: AN APPLIED STUDY IN RIO GRANDE DO NORTE AND PARAÍBA (L. Fernando da Cunha Silva, Trans.). RECIMA21 - Revista Científica Multidisciplinar - ISSN 2675-6218, 7(6), e768077. https://doi.org/10.47820/recima21.v7i6.8077