Accuracy Analysis of K-Nearest Neighbor and Naïve Bayes Algorithm in the Diagnosis of Breast Cancer
Abstract
In the medical field, many patient records are kept, including data on breast cancer. Data mining is the process of extracting previously unknown information from data. It uses pattern recognition techniques drawn from statistics and mathematics to find patterns in historical data or cases. One of the main tasks of data mining is classification. A classification dataset has one target attribute, also called the label attribute, whose value is predicted for new data on the basis of the other attributes observed in past data. The number of attributes can affect the performance of an algorithm. If the classification is inaccurate, the researcher needs to re-check each previous stage to look for errors. The best algorithm for one type of data is not necessarily good for another. For this reason, this study compares the K-Nearest Neighbor and Naïve Bayes algorithms. The research method was to prepare data from the breast cancer dataset, train and test on the data, and then perform a comparative analysis. The aim is to identify the better algorithm for classifying breast cancer, so that patients can be predicted as having malignant or benign breast cancer from their existing parameters. The resulting pattern can be used as a diagnostic aid so that the disease can be detected earlier, which is expected to reduce the mortality rate from breast cancer. In the comparison, K-Nearest Neighbor achieved an accuracy of 95.79% and Naïve Bayes 93.39%.
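The train, test, and compare procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the dataset variant, the features, the value of k, or the Naïve Bayes variant, so the toy two-feature data, `k=3`, the Gaussian likelihood, and all function names below are assumptions chosen only for illustration.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    # Rank training points by Euclidean distance to x.
    neighbors = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


def fit_gaussian_nb(train_X, train_y):
    """Estimate the prior, feature means, and feature variances per class."""
    model = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small constant keeps every variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[label] = (n / len(train_y), means, variances)
    return model


def nb_predict(model, x):
    """Return the class with the highest log-posterior score."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        # Naive assumption: features are independent given the class.
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


def accuracy(predict, test_X, test_y):
    """Fraction of test samples whose predicted label matches the true label."""
    correct = sum(predict(x) == y for x, y in zip(test_X, test_y))
    return correct / len(test_y)


# Hypothetical toy data (NOT the breast cancer dataset used in the article).
train_X = [(1.0, 1.2), (1.5, 0.9), (0.8, 1.1), (1.2, 1.4),
           (3.0, 3.2), (3.4, 2.9), (2.9, 3.5), (3.3, 3.1)]
train_y = ["benign"] * 4 + ["malignant"] * 4
test_X = [(1.1, 1.0), (3.1, 3.0), (0.9, 1.3), (3.5, 3.3)]
test_y = ["benign", "malignant", "benign", "malignant"]

nb_model = fit_gaussian_nb(train_X, train_y)
knn_acc = accuracy(lambda x: knn_predict(train_X, train_y, x, k=3), test_X, test_y)
nb_acc = accuracy(lambda x: nb_predict(nb_model, x), test_X, test_y)
print(f"k-NN accuracy: {knn_acc:.2%}")
print(f"Naive Bayes accuracy: {nb_acc:.2%}")
```

On well-separated toy data such as this, both classifiers label every test point correctly, so the sketch shows only the mechanics of the comparison. The accuracies reported in the abstract come from the authors' own data and experimental setup.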
Article Details
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.