Proceedings of the Second International Conference on Emerging Trends in Engineering (ICETE 2023)

Impact of Various Data Splitting Ratios on the Performance of Machine Learning Models in the Classification of Lung Cancer

Authors
Archana Nazarkar1, *, Harish Kuchulakanti2, Chandra Sekhar Paidimarry3, Sravya Kulkarni2
1Department of Biomedical Engineering, B V Raju Institute of Technology, Narsapur, India
2Department of Biomedical Engineering, Osmania University, Hyderabad, India
3Department of Electronics and Communication Engineering, Osmania University, Hyderabad, India
*Corresponding author. Email: archana.n@bvrit.ac.in
Corresponding Author
Archana Nazarkar
Available Online 9 November 2023.
DOI
10.2991/978-94-6463-252-1_12
Keywords
Lung cancer (LC); Artificial Neural Network (ANN); Data Splitting Ratios
Abstract

Owing to rapid technological advances and the availability of extensive experimental data, particularly in image analysis and processing, artificial intelligence (AI) and machine learning (ML) have recently become widely popular. Medical specialties in which imaging is essential, such as radiology, pathology, and oncology, have seized this opportunity, and significant research and development effort has gone into translating the promise of AI and ML into therapeutic applications. These tools are increasingly used for common medical image analysis tasks, including diagnosis, segmentation, and classification. In this study, four classifiers, Artificial Neural Network (ANN), Support Vector Machine (SVM), Naïve Bayes (NB), and K-Nearest Neighbour (KNN), are used to classify lung cancer based on features extracted by a lung segmentation algorithm. The feature data are estimated from 90 image sets, combined for normalization, and divided into training, validation, and testing sets in a ratio of 80:10:10. To assess model performance, the datasets were divided into training and testing sets using different ratios (80/20, 70/30, 60/40, and 50/50). ANN and KNN achieved an accuracy of 99.8% with moderate and high proportions of training data.
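
The splitting ratios named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `split_indices`, the fixed seed, and the sample count of 90 (one feature vector per image set) are assumptions made only for illustration, and the paper's actual sample count and shuffling procedure are not stated here.

```python
import random


def split_indices(n_samples, ratios, seed=0):
    """Shuffle range(n_samples) and cut it into consecutive parts sized by `ratios`.

    `ratios` is a sequence of non-negative integers such as (80, 10, 10) or
    (70, 30). Hypothetical helper for illustration, not taken from the paper.
    """
    if n_samples < 0 or not ratios or any(r < 0 for r in ratios) or sum(ratios) <= 0:
        raise ValueError("n_samples must be >= 0 and ratios must be non-negative with a positive sum")

    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible

    total = sum(ratios)
    parts, start, cumulative = [], 0, 0
    for r in ratios:
        cumulative += r
        # Integer arithmetic on cumulative ratios: the last boundary is always n_samples,
        # so every sample lands in exactly one part.
        end = (n_samples * cumulative) // total
        parts.append(indices[start:end])
        start = end
    return parts


if __name__ == "__main__":
    n = 90  # assumed sample count, for illustration only
    for ratios in [(80, 10, 10), (80, 20), (70, 30), (60, 40), (50, 50)]:
        sizes = [len(part) for part in split_indices(n, ratios)]
        print(ratios, sizes)
```

Under the assumed 90 samples, the 80:10:10 split gives 72 training, 9 validation, and 9 testing samples, while the 50/50 split leaves only 45 samples for training, which is the kind of reduction in training data whose effect on accuracy the study examines.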

Copyright
© 2023 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the Second International Conference on Emerging Trends in Engineering (ICETE 2023)
Series
Advances in Engineering Research
Publication Date
9 November 2023
ISBN
978-94-6463-252-1
ISSN
2352-5401

Cite this article

TY  - CONF
AU  - Archana Nazarkar
AU  - Harish Kuchulakanti
AU  - Chandra Sekhar Paidimarry
AU  - Sravya Kulkarni
PY  - 2023
DA  - 2023/11/09
TI  - Impact of Various Data Splitting Ratios on the Performance of Machine Learning Models in the Classification of Lung Cancer
BT  - Proceedings of the Second International Conference on Emerging Trends in Engineering (ICETE 2023)
PB  - Atlantis Press
SP  - 96
EP  - 104
SN  - 2352-5401
UR  - https://doi.org/10.2991/978-94-6463-252-1_12
DO  - 10.2991/978-94-6463-252-1_12
ID  - Nazarkar2023
ER  -