Accuracy Analysis of K-Nearest Neighbor and Naïve Bayes Algorithm in the Diagnosis of Breast Cancer

Irma Handayani; Ikrimach Ikrimach

doi:10.20895/infotel.v12i4.547

Published Nov 29, 2020

DOI https://doi.org/10.20895/infotel.v12i4.547

Irma Handayani

Universitas Teknologi Yogyakarta

Ikrimach Ikrimach

Universitas Teknologi Yogyakarta

Abstract

In the medical field, there are many records of disease sufferers, one of which is data on breast cancer. The process of extracting previously unknown information from data is known as data mining. Data mining uses pattern recognition techniques, such as statistics and mathematics, to find patterns in historical data or cases. One of the main roles of data mining is classification. A classification dataset has one target attribute, also called the label attribute. For new data, the value of this attribute is predicted from the other attributes, based on past cases. The number of attributes can affect the performance of an algorithm. Consequently, if the classification process is inaccurate, the researcher needs to double-check each previous stage to look for errors. The best algorithm for one data type is not necessarily good for another data type. For this reason, this study applies and compares the K-Nearest Neighbor and Naïve Bayes algorithms. The research method was to prepare data from the breast cancer dataset, train and test on the data, then perform a comparative analysis. The research target is to identify the better algorithm for classifying breast cancer, so that patients can be predicted, from their existing parameters, as having malignant or benign breast cancer. This pattern can be used as a diagnostic measure so that breast cancer can be detected earlier, which is expected to reduce its mortality rate. In the comparison, K-Nearest Neighbor achieved an accuracy of 95.79% and Naïve Bayes 93.39%.
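
The workflow summarized in the abstract (prepare data, train, test, compare accuracy) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration: the toy feature values are made up and are not the paper's breast cancer dataset, `k=3` is an arbitrary choice, and the Naïve Bayes variant shown is a Gaussian one, which the abstract does not confirm. The function names (`knn_predict`, `fit_gaussian_nb`, `nb_predict`, `accuracy`) are hypothetical helpers, not the authors' code.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training points (Euclidean)."""
    neighbors = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]

def fit_gaussian_nb(train_X, train_y):
    """Estimate the class prior and per-feature mean/variance for each class."""
    model = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            # Small constant keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (len(rows) / len(train_X), stats)
    return model

def nb_predict(model, x):
    """Return the class with the highest log-posterior under the Gaussian model."""
    def log_posterior(label):
        prior, stats = model[label]
        score = math.log(prior)
        for value, (mean, var) in zip(x, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)

def accuracy(predict, test_X, test_y):
    """Fraction of test samples whose predicted label matches the true label."""
    correct = sum(predict(x) == y for x, y in zip(test_X, test_y))
    return correct / len(test_y)

# Made-up toy data (assumption): two features per sample, two class labels.
train_X = [(1.0, 1.2), (1.5, 0.8), (0.9, 1.1), (1.2, 1.4),
           (5.0, 5.5), (5.5, 4.8), (4.9, 5.1), (5.2, 5.4)]
train_y = ["benign"] * 4 + ["malignant"] * 4
test_X = [(1.1, 1.0), (5.1, 5.0), (1.3, 1.3), (4.8, 5.2)]
test_y = ["benign", "malignant", "benign", "malignant"]

nb_model = fit_gaussian_nb(train_X, train_y)
knn_acc = accuracy(lambda x: knn_predict(train_X, train_y, x, k=3), test_X, test_y)
nb_acc = accuracy(lambda x: nb_predict(nb_model, x), test_X, test_y)
print(f"k-NN accuracy: {knn_acc:.2f}, Naive Bayes accuracy: {nb_acc:.2f}")
```

On real data the two classifiers would generally disagree on some samples, and the comparison reported in the abstract corresponds to comparing the two accuracy values computed on the same held-out test set.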

How to Cite

[1]

I. Handayani and I. Ikrimach, “Accuracy Analysis of K-Nearest Neighbor and Naïve Bayes Algorithm in the Diagnosis of Breast Cancer”, INFOTEL, vol. 12, no. 4, pp. 151-159, Nov. 2020.

Issue

Vol 12 No 4 (2020): November 2020

Section

Informatics

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
