Breast cancer recurrence prediction system using k-nearest neighbor, naïve-bayes, and support vector machine algorithm
Abstract
Breast cancer is one of the deadliest diseases in the world. Statistics show that it is the second most common cancer worldwide, with around two million new cases per year. Prior research has addressed breast cancer, and advances in technology now allow it to be detected earlier with artificial intelligence or machine learning. Several popular machine learning algorithms can be used to predict the presence or recurrence of breast cancer, for example k-Nearest Neighbor (kNN), Naïve Bayes, and Support Vector Machine (SVM). This study evaluates the prediction of breast cancer recurrence with these three algorithms on the breast cancer dataset available from the University of California, Irvine (UCI). The results show that the kNN algorithm achieves the highest accuracy in predicting breast cancer recurrence.
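To make the comparison concrete, the following is a minimal sketch of how two of the three classifiers can be scored by accuracy on categorical records. It is not the authors' implementation. The six training records and two test records are invented for illustration and only imitate the attribute style of the UCI data (age band, tumor-size band, node-caps). The Hamming distance for kNN, the Laplace smoothing for Naïve Bayes, and all function names are assumptions. SVM is left out to keep the sketch short.

```python
import math
from collections import Counter, defaultdict

def hamming(a, b):
    """Count the attribute positions where two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))

def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k training records closest to the query."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: hamming(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def nb_fit(train_X, train_y):
    """Collect class counts, per-class value counts, and each attribute's domain."""
    class_counts = Counter(train_y)
    value_counts = defaultdict(Counter)  # (class, attribute index) -> value counts
    domains = defaultdict(set)           # attribute index -> values seen in training
    for row, label in zip(train_X, train_y):
        for i, value in enumerate(row):
            value_counts[(label, i)][value] += 1
            domains[i].add(value)
    return class_counts, value_counts, domains

def nb_predict(model, query):
    """Pick the class with the highest log-posterior (Laplace-smoothed)."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n in class_counts.items():
        score = math.log(n / total)
        for i, value in enumerate(query):
            count = value_counts[(label, i)][value]
            score += math.log((count + 1) / (n + len(domains[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(predict, test_X, test_y):
    """Fraction of test records whose predicted label matches the true label."""
    hits = sum(predict(x) == y for x, y in zip(test_X, test_y))
    return hits / len(test_y)

# Invented toy records (NOT taken from the UCI dataset): (age, tumor-size, node-caps)
train_X = [
    ("40-49", "30-34", "yes"), ("40-49", "30-34", "no"), ("50-59", "30-34", "yes"),
    ("50-59", "10-14", "no"), ("60-69", "10-14", "no"), ("60-69", "15-19", "no"),
]
train_y = ["recurrence-events"] * 3 + ["no-recurrence-events"] * 3
test_X = [("60-69", "30-34", "yes"), ("50-59", "15-19", "no")]
test_y = ["recurrence-events", "no-recurrence-events"]

nb_model = nb_fit(train_X, train_y)
knn_acc = accuracy(lambda x: knn_predict(train_X, train_y, x, k=3), test_X, test_y)
nb_acc = accuracy(lambda x: nb_predict(nb_model, x), test_X, test_y)
print(f"kNN accuracy: {knn_acc:.2f}")
print(f"Naive Bayes accuracy: {nb_acc:.2f}")
```

On real data, accuracy would normally be estimated with a held-out split or cross-validation rather than a handful of hand-picked records, and the choice of k would be tuned.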
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.