Proceedings of the 2023 International Conference on Image, Algorithms and Artificial Intelligence (ICIAAI 2023)

Investigation Related to Performance of KNN, Logistic Regression and XGBoost on Diabetes Prediction

Authors
Jiaguo Lin1, *
1Seaver College, Pepperdine University, Malibu, 90263, United States
*Corresponding author. Email: jiaguo.lin@pepperdine.edu
Available Online 27 November 2023.
DOI
10.2991/978-94-6463-300-9_70
Keywords
Machine Learning; Algorithms; Diabetes
Abstract

This study builds diabetes prediction models with three machine learning algorithms, K Nearest Neighbors (KNN), Logistic Regression, and Extreme Gradient Boosting (XGBoost), and compares their accuracy. The goal is to identify an accurate algorithm for diabetes prediction, which is conducive to diagnosis by doctors and helps patients receive appropriate treatment in time. Before the models are built, the dataset is pre-processed with standard scaling, and the Synthetic Minority Over-sampling Technique (SMOTE) is applied to balance the classes. Grid Search CV is then used to find the best parameters for each model. The results show that KNN achieves an accuracy of 82%, followed by XGBoost at 79.87% and Logistic Regression at 75.5%. An advantage of the KNN algorithm is that it considers only the distance between the training samples and the new sample to be predicted, without any other computation. Overall, KNN demonstrated the best performance among the three algorithms. Future work can expand the dataset and explore more parameters to achieve higher accuracy in diabetes prediction.
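The pre-processing and KNN steps described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the author's code: the toy data, the two feature columns (hypothetically glucose and BMI), the helper names, and the candidate values of k are all assumptions made for illustration, and the oversampling function is a simplified SMOTE-style interpolation rather than a library implementation.

```python
import math
import random
from collections import Counter


def standardize(train, other):
    """Scale features to zero mean and unit variance using training statistics only."""
    cols = list(zip(*train))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

    return scale(train), scale(other)


def smote_like(X, y, minority, k=2, seed=0):
    """Balance a binary dataset by interpolating between a minority point
    and one of its k nearest minority neighbours (the core SMOTE idea)."""
    rng = random.Random(seed)
    minority_pts = [x for x, label in zip(X, y) if label == minority]
    n_needed = (len(y) - len(minority_pts)) - len(minority_pts)
    X_new, y_new = list(X), list(y)
    for _ in range(max(n_needed, 0)):
        base = rng.choice(minority_pts)
        neighbours = sorted((p for p in minority_pts if p is not base),
                            key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        X_new.append([b + gap * (o - b) for b, o in zip(base, other)])
        y_new.append(minority)
    return X_new, y_new


def knn_predict(X_train, y_train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(zip(X_train, y_train), key=lambda t: math.dist(t[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cv_accuracy(X, y, k, folds=3):
    """Mean accuracy of k-NN over simple interleaved folds."""
    scores = []
    for f in range(folds):
        test_idx = set(range(f, len(X), folds))
        X_tr = [x for i, x in enumerate(X) if i not in test_idx]
        y_tr = [lab for i, lab in enumerate(y) if i not in test_idx]
        hits = sum(knn_predict(X_tr, y_tr, X[i], k) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / folds


# Invented toy data: [glucose, BMI]; label 1 = diabetic (minority class).
X = [[85, 22], [90, 24], [95, 21], [88, 23], [92, 25], [100, 26],
     [160, 35], [170, 38], [155, 33]]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]
X_test = [[93, 23], [165, 36]]

X_s, X_test_s = standardize(X, X_test)
X_bal, y_bal = smote_like(X_s, y, minority=1)

# Tiny grid search: keep the k with the best cross-validated accuracy.
best_k = max([1, 3, 5], key=lambda k: cv_accuracy(X_bal, y_bal, k))
preds = [knn_predict(X_bal, y_bal, x, best_k) for x in X_test_s]
print(best_k, preds)
```

For brevity, this sketch oversamples before splitting into folds, so synthetic points can appear in the validation folds. A more careful evaluation would oversample only the training portion of each fold.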

Copyright
© 2023 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.


Volume Title
Proceedings of the 2023 International Conference on Image, Algorithms and Artificial Intelligence (ICIAAI 2023)
Series
Advances in Computer Science Research
Publication Date
27 November 2023
ISBN
978-94-6463-300-9
ISSN
2352-538X

Cite this article

TY  - CONF
AU  - Jiaguo Lin
PY  - 2023
DA  - 2023/11/27
TI  - Investigation Related to Performance of KNN, Logistic Regression and XGBoost on Diabetes Prediction
BT  - Proceedings of the 2023 International Conference on Image, Algorithms and Artificial Intelligence (ICIAAI 2023)
PB  - Atlantis Press
SP  - 670
EP  - 676
SN  - 2352-538X
UR  - https://doi.org/10.2991/978-94-6463-300-9_70
DO  - 10.2991/978-94-6463-300-9_70
ID  - Lin2023
ER  -